<!DOCTYPE html
PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<!-- saved from url=(0014)about:internet -->
<html xmlns:MSHelp="http://www.microsoft.com/MSHelp/" lang="en-us" xml:lang="en-us"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta name="DC.Type" content="topic">
<meta name="DC.Title" content="Elementwise">
<meta name="DC.subject" content="Elementwise">
<meta name="keywords" content="Elementwise">
<meta name="DC.Relation" scheme="URI" content="../../tbb_userguide/Design_Patterns/Design_Patterns.htm">
<meta name="DC.Relation" scheme="URI" content="Agglomeration.htm#Agglomeration">
<meta name="DC.Relation" scheme="URI" content="Reduction.htm#Reduction">
<meta name="DC.Format" content="XHTML">
<meta name="DC.Identifier" content="Elementwise">
<link rel="stylesheet" type="text/css" href="../../intel_css_styles.css">
<title>Elementwise</title>
<xml>
<MSHelp:Attr Name="DocSet" Value="Intel"></MSHelp:Attr>
<MSHelp:Attr Name="Locale" Value="kbEnglish"></MSHelp:Attr>
<MSHelp:Attr Name="TopicType" Value="kbReference"></MSHelp:Attr>
</xml>
</head>
<body id="Elementwise">
<!-- ==============(Start:NavScript)================= -->
<script src="..\..\NavScript.js" language="JavaScript1.2" type="text/javascript"></script>
<script language="JavaScript1.2" type="text/javascript">WriteNavLink(2);</script>
<!-- ==============(End:NavScript)================= -->
<a name="Elementwise"><!-- --></a>
<h1 class="topictitle1">Elementwise</h1>
<div>
<div class="section"><h2 class="sectiontitle">Problem</h2>
<p>Initiate similar independent computations across items in a data set,
and wait until all complete.
</p>
</div>
<div class="section"><h2 class="sectiontitle">Context</h2>
<p>Many serial algorithms sweep over a set of items and do an independent
computation on each item. However, if some kind of summary information is
collected, use the Reduction pattern instead.
</p>
</div>
<div class="section"><h2 class="sectiontitle">Forces</h2>
<p>No information is carried or merged between the computations.
</p>
</div>
<div class="section"><h2 class="sectiontitle">Solution</h2>
<p>If the number of items is known in advance, use
<samp class="codeph">tbb::parallel_for</samp>. If not, consider using
<samp class="codeph">tbb::parallel_do</samp>.
</p>
<p>Use agglomeration if the individual computations are small relative to
scheduler overheads.
</p>
<p>If the pattern is followed by a reduction on the same data, consider
doing the element-wise operation as part of the reduction, so that the
combination of the two patterns is accomplished in a single sweep instead of
two sweeps. Doing so may improve performance by reducing traffic through the
memory hierarchy.
</p>
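<p>As a minimal serial sketch of that fusion, the fragment below contrasts two
sweeps with one. The functions <samp class="codeph">scale_then_sum</samp> and
<samp class="codeph">scale_and_sum_fused</samp>, and the scaling operation
itself, are hypothetical illustrations and not part of TBB.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two sweeps: an elementwise pass writes y, then a second pass reads y back
// to reduce it. For large inputs y may travel through memory twice.
float scale_then_sum(const std::vector<float>& x, float a,
                     std::vector<float>& y) {
    y.resize(x.size());
    for (std::size_t i = 0; i < x.size(); ++i)
        y[i] = a * x[i];
    float sum = 0;
    for (std::size_t i = 0; i < y.size(); ++i)
        sum += y[i];
    return sum;
}

// One sweep: the elementwise step and the reduction step share a loop body,
// so each y[i] is consumed right after it is produced.
float scale_and_sum_fused(const std::vector<float>& x, float a,
                          std::vector<float>& y) {
    y.resize(x.size());
    float sum = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        y[i] = a * x[i];  // elementwise step
        sum += y[i];      // reduction step on the value just written
    }
    return sum;
}
```

<p>In a parallel version, the fused loop body would presumably become the body
passed to <samp class="codeph">tbb::parallel_reduce</samp>, as described in the
Reduction pattern.
</p>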
</div>
<div class="section"><h2 class="sectiontitle">Example</h2>
<p>Convolution is often used in signal processing. The convolution of a
filter
<var>c</var> and signal
<var>x</var> is computed as:
</p>
<br><img width="99" height="29" src="Images/image004.jpg" alt="y[i] = sum over j of c[j]*x[i-j]"><br>
<p>Serial code for this computation might look like:
</p>
<pre>// Assumes c[0..clen-1], x[1-clen..xlen+clen-2], and y[0..xlen+clen-2] are defined
for( int i=0; i&lt;xlen+clen-1; ++i ) {
    float tmp = 0;
    for( int j=0; j&lt;clen; ++j )
        tmp += c[j]*x[i-j];
    y[i] = tmp;
}</pre>
<p>For simplicity, the fragment assumes that
<samp class="codeph">x</samp> is a pointer into an array padded with zeros such
that
<samp class="codeph">x[k]</samp> returns zero when
<samp class="codeph">k&lt;0</samp> or
<samp class="codeph">k≥xlen</samp>.
</p>
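<p>One way to set up such a padded pointer is sketched below. The
<samp class="codeph">PaddedSignal</samp> type and the
<samp class="codeph">convolve</samp> wrapper are hypothetical names introduced
for illustration. The sketch assumes <samp class="codeph">clen</samp> is at
least 1 and pads with <samp class="codeph">clen-1</samp> zeros on each side,
which is just enough for the indices the loop touches.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of TBB): owns a zero-padded copy of a signal
// so that x()[k] is readable for every k in [1-clen, xlen+clen-2].
struct PaddedSignal {
    std::vector<float> storage;  // clen-1 zeros, the samples, clen-1 zeros
    int pad;                     // number of zeros on each side

    // Assumes clen >= 1.
    PaddedSignal(const std::vector<float>& samples, int clen)
        : storage(samples.size() + 2 * static_cast<std::size_t>(clen - 1), 0.0f),
          pad(clen - 1) {
        for (std::size_t k = 0; k < samples.size(); ++k)
            storage[pad + k] = samples[k];
    }

    // Pointer to sample 0; negative indices land in the leading zeros.
    const float* x() const { return storage.data() + pad; }
};

// Serial convolution with the same loop structure as the fragment above.
std::vector<float> convolve(const std::vector<float>& c,
                            const std::vector<float>& samples) {
    const int clen = static_cast<int>(c.size());
    const int xlen = static_cast<int>(samples.size());
    PaddedSignal padded(samples, clen);
    const float* x = padded.x();
    std::vector<float> y(xlen + clen - 1);
    for (int i = 0; i < xlen + clen - 1; ++i) {
        float tmp = 0;
        for (int j = 0; j < clen; ++j)
            tmp += c[j] * x[i - j];
        y[i] = tmp;
    }
    return y;
}
```

<p>For a filter <samp class="codeph">{1, 2}</samp> and a signal
<samp class="codeph">{1, 2, 3}</samp>, this sketch yields
<samp class="codeph">{1, 4, 7, 6}</samp>.
</p>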
<p>The inner loop does not fit the elementwise pattern, because each
iteration depends on the previous iteration. However, the outer loop fits the
elementwise pattern. It is straightforward to render it using
<samp class="codeph">tbb::parallel_for</samp> as shown:
</p>
<pre>tbb::parallel_for( 0, xlen+clen-1, [=]( int i ) {
    float tmp = 0;
    for( int j=0; j&lt;clen; ++j )
        tmp += c[j]*x[i-j];
    y[i] = tmp;
});</pre>
<p><samp class="codeph">tbb::parallel_for</samp> does automatic agglomeration by
implicitly using <samp class="codeph">tbb::auto_partitioner</samp> in its underlying
implementation. If there is reason to agglomerate explicitly, use the overload
of
<samp class="codeph">tbb::parallel_for</samp> that takes an explicit range
argument. The following shows the example transformed to use the overload.
</p>
<pre>tbb::parallel_for(
    // Grainsize of 1000: a range of 1000 or fewer iterations is not split further
    tbb::blocked_range&lt;int&gt;(0,xlen+clen-1,1000),
    [=]( tbb::blocked_range&lt;int&gt; r ) {
        int end = r.end();
        for( int i=r.begin(); i!=end; ++i ) {
            float tmp = 0;
            for( int j=0; j&lt;clen; ++j )
                tmp += c[j]*x[i-j];
            y[i] = tmp;
        }
    }
);</pre>
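<p>To give a rough feel for what the grainsize argument controls, the sketch
below models range splitting in plain C++. It is a simplified model and not
TBB's scheduler: <samp class="codeph">split_range</samp> is a hypothetical
function that halves a range while it holds more than
<samp class="codeph">grainsize</samp> iterations, which mirrors the
divisibility rule of <samp class="codeph">tbb::blocked_range</samp>. It assumes
<samp class="codeph">grainsize</samp> is at least 1.
</p>

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical model of range splitting; not TBB's scheduler.
// A range [begin, end) is split at its midpoint while it holds more than
// `grainsize` iterations. Assumes grainsize >= 1.
void split_range(int begin, int end, int grainsize,
                 std::vector<std::pair<int, int>>& chunks) {
    if (end - begin > grainsize) {
        int middle = begin + (end - begin) / 2;
        split_range(begin, middle, grainsize, chunks);
        split_range(middle, end, grainsize, chunks);
    } else {
        // Small enough: this chunk would be handed to the loop body as one unit.
        chunks.emplace_back(begin, end);
    }
}
```

<p>Under this model, splitting <samp class="codeph">[0,10)</samp> with a
grainsize of 3 produces the chunks <samp class="codeph">[0,2)</samp>,
<samp class="codeph">[2,5)</samp>, <samp class="codeph">[5,7)</samp>, and
<samp class="codeph">[7,10)</samp>. With
<samp class="codeph">tbb::auto_partitioner</samp>, splitting may stop earlier,
so actual chunks can be larger than the grainsize.
</p>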
</div>
</div>
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong>&nbsp;<a href="../../tbb_userguide/Design_Patterns/Design_Patterns.htm">Design Patterns</a></div>
</div>
<div class="See Also">
<h2>See Also</h2>
<div class="linklist">
<div><a href="Agglomeration.htm#Agglomeration">Agglomeration
</a></div>
<div><a href="Reduction.htm#Reduction">Reduction
</a></div></div>
</div>
</body>
</html>