	PTX: Add basic documentation to CodeGenerator.html
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137315 91177308-0d34-0410-b5e6-96231b3b80d8
		| @@ -114,6 +114,7 @@ | |||||||
|       <li><a href="#ppc_prolog">Prolog/Epilog</a></li> |       <li><a href="#ppc_prolog">Prolog/Epilog</a></li> | ||||||
|       <li><a href="#ppc_dynamic">Dynamic Allocation</a></li> |       <li><a href="#ppc_dynamic">Dynamic Allocation</a></li> | ||||||
|       </ul></li> |       </ul></li> | ||||||
|  |     <li><a href="#ptx">The PTX backend</a></li> | ||||||
|     </ul></li> |     </ul></li> | ||||||
|  |  | ||||||
| </ol> | </ol> | ||||||
| @@ -2912,6 +2913,70 @@ MOVSX32rm16 -> movsx, 32-bit register, 16-bit memory | |||||||
|  |  | ||||||
| </div> | </div> | ||||||
|  |  | ||||||
|  | </div> | ||||||
|  |  | ||||||
|  | <!-- ======================================================================= --> | ||||||
|  | <h3> | ||||||
|  |   <a name="ptx">The PTX backend</a> | ||||||
|  | </h3> | ||||||
|  |  | ||||||
|  | <div> | ||||||
|  |  | ||||||
|  | <p>The PTX code generator lives in the lib/Target/PTX directory. It is | ||||||
|  |   currently a work in progress, but already supports most of the code | ||||||
|  |   generation functionality needed to generate correct PTX kernels for | ||||||
|  |   CUDA devices.</p> | ||||||
|  |  | ||||||
|  | <p>The code generator can target PTX 2.0+ and shader model 1.0+.  The | ||||||
|  |   PTX ISA Reference Manual is used as the primary source of ISA | ||||||
|  |   information, though an effort is made to make the output of the code | ||||||
|  |   generator match the output of the NVIDIA nvcc compiler whenever | ||||||
|  |   possible.</p> | ||||||
|  |  | ||||||
|  | <p>Code Generator Options:</p> | ||||||
|  | <table border="1" cellspacing="0"> | ||||||
|  |   <tr> | ||||||
|  |     <th>Option</th> | ||||||
|  |     <th>Description</th> | ||||||
|  |  </tr> | ||||||
|  |    <tr> | ||||||
|  |      <td><code>double</code></td> | ||||||
|  |      <td align="left">If enabled, the map_f64_to_f32 directive is | ||||||
|  |        disabled in the PTX output, allowing native double-precision | ||||||
|  |        arithmetic</td> | ||||||
|  |   </tr> | ||||||
|  |   <tr> | ||||||
|  |     <td><code>no-fma</code></td> | ||||||
|  |     <td align="left">Disable generation of fused multiply-add (FMA) | ||||||
|  |       instructions; disabling them may be beneficial for some devices</td> | ||||||
|  |   </tr> | ||||||
|  |   <tr> | ||||||
|  |     <td><code>smxy / computexy</code></td> | ||||||
|  |     <td align="left">Set shader model/compute capability to x.y, | ||||||
|  |     e.g. sm20 or compute13</td> | ||||||
|  |   </tr> | ||||||
|  | </table> | ||||||
|  |  | ||||||
|  | <p>Working:</p> | ||||||
|  | <ul> | ||||||
|  |   <li>Arithmetic instruction selection (including combo FMA)</li> | ||||||
|  |   <li>Bitwise instruction selection</li> | ||||||
|  |   <li>Control-flow instruction selection</li> | ||||||
|  |   <li>Function calls (only on SM 2.0+, and only without return values)</li> | ||||||
|  |   <li>Address spaces (0 = global, 1 = constant, 2 = local, 4 = | ||||||
|  |   shared)</li> | ||||||
|  |   <li>Thread synchronization (bar.sync)</li> | ||||||
|  |   <li>Special register reads ([N]TID, [N]CTAID, PMx, CLOCK, etc.)</li> | ||||||
|  | </ul> | ||||||
|  |  | ||||||
|  | <p>In Progress:</p> | ||||||
|  | <ul> | ||||||
|  |   <li>Robust call instruction selection</li> | ||||||
|  |   <li>Stack frame allocation</li> | ||||||
|  |   <li>Device-specific instruction scheduling optimizations</li> | ||||||
|  | </ul> | ||||||
|  |  | ||||||
|  |  | ||||||
| </div> | </div> | ||||||
|  |  | ||||||
| </div> | </div> | ||||||
|   | |||||||
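The address-space numbering listed under "Working" in the added documentation (0 = global, 1 = constant, 2 = local, 4 = shared) can be illustrated with a small lookup. This is only a sketch, not code from the PTX backend: the function name `ptxStateSpace` is hypothetical, and pairing each number with the PTX state-space keywords `.global`, `.const`, `.local`, and `.shared` is an assumption based on the names given in the list.

```cpp
#include <string>

// Hypothetical helper, NOT part of lib/Target/PTX: maps the LLVM
// address-space numbers listed in the documentation to the PTX
// state-space keyword each one is assumed to correspond to.
inline std::string ptxStateSpace(unsigned addrSpace) {
  switch (addrSpace) {
  case 0: return ".global"; // 0 = global
  case 1: return ".const";  // 1 = constant
  case 2: return ".local";  // 2 = local
  case 4: return ".shared"; // 4 = shared
  default: return "";       // not listed in the documentation (e.g. 3)
  }
}
```

For example, `ptxStateSpace(4)` yields `".shared"`, while an unlisted number such as 3 yields an empty string, so a caller can tell that the documentation gives no mapping for it.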