====Network Function Acceleration====
  
We begin our investigation by studying FPGA acceleration of Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS).
Today’s state-of-the-art software intrusion detection systems (IDS) can process about 1 Gbps per high-end CPU core.  This is an untenable starting point for an intrusion prevention system (IPS) that must operate inline with today’s 100 Gbps links and respond in time to stop a malicious packet from propagating.  In Project Pigasus, we are accelerating a 10K-rule IPS to 100 Gbps on one Stratix-10 MX FPGA.  The effort required a very different design mindset from the software approach.  We have been able to achieve 100 Gbps by making effective use of the Stratix-10 MX FPGA’s fast on-chip SRAM.  For TCP reassembly, Pigasus uses dynamic allocation to compactly store packets in on-chip SRAM.  Pigasus implements multistring “fast pattern” matching using small SRAM hash tables combined with another lightweight filter.  In the current system, only the (not-yet-accelerated) full matching stage is offloaded to CPU cores in a stateless fashion.

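The fast-pattern/full-match split described above can be illustrated in software.  The sketch below is purely conceptual, not the Pigasus RTL design: a small hash table keyed on a short prefix of each rule acts as the fast-pattern prefilter, and only packets that hit the filter reach the expensive full matcher (the stage that Pigasus offloads to CPU cores).  All names and the 4-byte fast-pattern length are illustrative assumptions.

```python
# Conceptual sketch of "fast pattern" prefiltering, in the spirit of the
# Pigasus pipeline described above. Names and structure are illustrative,
# not the actual RTL design.

FP_LEN = 4  # fast-pattern length in bytes (illustrative; real designs tune this)

def build_fast_table(rules):
    """Map each rule's first FP_LEN bytes to the rule ids that share them."""
    table = {}
    for rule_id, pattern in enumerate(rules):
        table.setdefault(pattern[:FP_LEN], []).append(rule_id)
    return table

def prefilter(payload, table):
    """Cheap stage: return candidate rule ids whose fast pattern
    occurs anywhere in the payload."""
    candidates = set()
    for i in range(len(payload) - FP_LEN + 1):
        hits = table.get(payload[i:i + FP_LEN])
        if hits:
            candidates.update(hits)
    return candidates

def full_match(payload, rules, candidates):
    """Expensive exact check, run only on prefilter survivors
    (the stage offloaded to CPU cores in the description above)."""
    return {r for r in candidates if rules[r] in payload}

rules = [b"evil.exe", b"GET /admin", b"\x90\x90\x90\x90"]
table = build_fast_table(rules)
payload = b"GET /admin HTTP/1.1\r\nHost: x\r\n"
cands = prefilter(payload, table)
matches = full_match(payload, rules, cands)
```

The point of the split is that the vast majority of innocent packets miss every fast-pattern entry and never pay the cost of full matching, which is what makes a small on-chip SRAM hash table so effective as the first stage.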
We are looking to generalize the framework and components from the Pigasus IPS effort to other in-network compute opportunities.  Future work is also to create a high-level domain-specific NF programming framework for use by networking experts who are not RTL experts.  This is joint work with [[http://www.justinesherry.com/ |Justine Sherry]] and [[https://users.ece.cmu.edu/~vsekar/ | Vyas Sekar]].

====CoRAM (Classic)====
Our investigation into FPGA architecture for computing began in 2009 with the question: how should data-intensive FPGA compute kernels view and interact with external memory data?  In response, we developed the original CoRAM FPGA computing abstraction.  The goal of the CoRAM abstraction is to present the application developer with (1) a virtualized appearance of the FPGA’s resources (i.e., reconfigurable logic, external memory interfaces, and on-chip SRAMs) to hide low-level, non-portable, platform-specific details, and (2) standardized, easy-to-use high-level interfaces for controlling data movement between the memory interfaces and the in-fabric computation kernels.  Besides simplifying application development, the virtualization and standardization of the CoRAM abstraction also make portable and scalable application development possible.
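The separation CoRAM draws between data movement and computation can be modeled in a few lines.  This is a loose software analogy, not the real CoRAM control-action API (which is expressed as C-like control threads); every function name here is a hypothetical stand-in.  A control thread issues transfers from "external memory" into a bounded on-chip buffer, while the kernel only ever touches that buffer.

```python
# Loose software model of the CoRAM idea: a control thread moves data between
# external memory and an on-chip CoRAM buffer; the compute kernel only ever
# sees on-chip data. Function names are hypothetical, not the real CoRAM API.

EXTERNAL_MEMORY = list(range(16))   # stand-in for DRAM contents
CORAM_WORDS = 4                      # capacity of one on-chip CoRAM buffer

def coram_read(addr, count):
    """Control action: copy `count` words from external memory into a buffer."""
    return EXTERNAL_MEMORY[addr:addr + count]

def kernel(buf):
    """Compute kernel: operates only on on-chip data (here, sums one tile)."""
    return sum(buf)

def control_thread(total_words):
    """Streams the data set through the CoRAM buffer tile by tile, hiding
    all addressing and transfer details from the kernel."""
    acc = 0
    for addr in range(0, total_words, CORAM_WORDS):
        tile = coram_read(addr, CORAM_WORDS)   # data movement
        acc += kernel(tile)                    # computation on on-chip data
    return acc

result = control_thread(len(EXTERNAL_MEMORY))
```

Because the kernel never issues memory addresses itself, the same kernel can be retargeted to platforms with different memory interfaces or buffer sizes by changing only the control thread, which is the portability and scalability argument made above.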