tag:blogger.com,1999:blog-6333115279521503642024-03-05T00:15:35.909-08:00Java Performance and ScaleSourabh Ghosehttp://www.blogger.com/profile/12406723074994150793noreply@blogger.comBlogger5125tag:blogger.com,1999:blog-633311527952150364.post-90179229032928248692013-09-25T23:03:00.002-07:002013-09-28T12:32:49.506-07:00Concurrent Merge sorting of large number of different types of files in Java
<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
I recently worked on a customer project, part of which
required sorting and merging multiple large files based on a timestamp field in
the files. These were call data record (CDR) files: CSV files containing usage
records for all 2G, 3G and GPRS subscribers for each day. There were about 2000
files in total, with a varying number of 2G, 3G and GPRS files, and a total size
of roughly 200 GB.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Although this can be done using BigMemory, the customer wanted
a quick and dirty way of doing this. <o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Each file followed a naming convention that identifies its type:
the 1<sup>st</sup> 2 or 3 characters of the name are enough to tell which type of
file it is. Example file names:</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
3G</div>
<div class="MsoNormal">
'GMI02A_13082900_0183'</div>
<div class="MsoNormal">
'GMI02A_13082900_0184'</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
2G</div>
<div class="MsoNormal">
'RTPAHLR1_13082700_8853'</div>
<div class="MsoNormal">
'RTPAHLR1_13082701_8854'</div>
<div class="MsoNormal">
'RTPAHLR1_13082702_8855'</div>
<div class="MsoNormal">
'RTPAHLR1_13082703_8856'</div>
<div class="MsoNormal">
'RTPAHLR1_13082704_8857'</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
GPRS</div>
<div class="MsoNormal">
'MOU05_1308270005'</div>
<div class="MsoNormal">
'MOU05_1308270020'</div>
<div class="MsoNormal">
'MOU05_1308270035'</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
As you can see, the characters before the first underscore
identify the type of file, and the characters after that underscore are a
timestamp. If we sort and group the files by these timestamps, the records inside
each group are likely to cover the same range of timestamps. Note that there can
be multiple files per timestamp: for 3G and 2G, these are distinguished by the
characters after the 2<sup>nd</sup> underscore.</div>
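<div class="MsoNormal">
The naming convention can be sketched in code. This is only an illustration, not the code used in the project: the class and method names (CdrFileNames, typePrefix, timestampPart, groupByTimestamp) are made up for this example, and it assumes that every name has the form prefix_timestamp with an optional _suffix.</div>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not from the original project) for the file name convention. */
final class CdrFileNames {

    /** Characters before the first underscore, which identify the file type. */
    static String typePrefix(String fileName) {
        int firstUnderscore = fileName.indexOf('_');
        return firstUnderscore < 0 ? fileName : fileName.substring(0, firstUnderscore);
    }

    /** Characters between the first underscore and the second one (or the end of the name). */
    static String timestampPart(String fileName) {
        int start = fileName.indexOf('_') + 1;
        int end = fileName.indexOf('_', start);
        return end < 0 ? fileName.substring(start) : fileName.substring(start, end);
    }

    /** Groups names by their timestamp part; the TreeMap keeps the groups in ascending key order. */
    static Map<String, List<String>> groupByTimestamp(List<String> fileNames) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String name : fileNames) {
            groups.computeIfAbsent(timestampPart(name), k -> new ArrayList<>()).add(name);
        }
        return groups;
    }
}
```

<div class="MsoNormal">
For the two 3G names above, groupByTimestamp puts both files under the key 13082900. The GPRS names carry a longer timestamp part than the 2G and 3G names, so grouping is best done per file type.</div>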
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="mso-bidi-font-family: Calibri; mso-fareast-font-family: 新細明體;">A single sample line from each CSV type follows (with dummy
data):</span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="mso-bidi-font-family: Calibri; mso-fareast-font-family: 新細明體;">2G<o:p></o:p></span></div>
<div class="MsoNormal">
5x944324461a1b18d08d7ab25d9fa1595f,'ABCDFE12_13050112_9998','
','','4','43',' ',' ','','','466974301229529','886983095821','','',' ','
','','','2013-05-01 11:10:58.000',307,' ','0','0',' ','','','<span style="mso-spacerun: yes;"> </span>',' ',' ','','',' ','
','',' ',' ',' ','','1','2','1','6899066686','','','','1','2','1','6899066686','
',' ','','','<span style="mso-spacerun: yes;"> </span>',' ','
','','','<span style="mso-spacerun: yes;"> </span>','08-0-9-3<span style="mso-spacerun: yes;"> </span>','08-0-9-3<span style="mso-spacerun: yes;"> </span>','<span style="mso-spacerun: yes;"> </span>',' ABCDFE12<span style="mso-spacerun: yes;"> </span>','17','0','<span style="mso-spacerun: yes;">
</span>',' ',' ',0,'IGMRL01','2-15','ONXDEM','7-25','0','','','778158',0,0,'','','
',,'','','',,,'','','','','','','',' ',0,'','','','','','','',' ',' ','','','
','<span style="mso-spacerun: yes;"> </span>',' ','<span style="mso-spacerun: yes;"> </span>','8DCA28F284000007','1','1','1 ','656935***319','','','','','','','
','<span style="mso-spacerun: yes;"> </span>','<span style="mso-spacerun: yes;">
</span>','2013-05-01 11:04:58.000',307,' ','',' ',181,'1','1','1 ','656935***319','
','<span style="mso-spacerun: yes;"> </span>',' ',' ',' ','<span style="mso-spacerun: yes;"> </span>','<span style="mso-spacerun: yes;">
</span>','0','2','0','1','D9110935388202','','<span style="mso-spacerun: yes;">
</span>','','','',' ','',' ','<span style="mso-spacerun: yes;"> </span>','<span style="mso-spacerun: yes;"> </span>',,'','',' ',' ',' ',' ',' ',' ','<span style="mso-spacerun: yes;"> </span>','','','','','653095821','456683539688','6544066686','0823206559','','','','','','','654066686','34222206559','',''<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
GPRS<o:p></o:p></div>
<div class="MsoNormal">
5x944324461a1f18d093bfb25d9fa1595f,'MCCDSC_1305011035','18','0','3','1339655','466974301229529','2013-05-01
10:03:37.000','355026050423934','886983095821','133.22.29.4','31.39.11.321','11892013','INTERNET','CMSC099.M22dd66.GPRS','1','111.334.23.12','413427','239','58844','1','8
','46697 ','0','1','2',0,0,'02','FF','00','00','00','FE','00','00','00','00','00','4A','00','02','02','93','96','83','FE','74','81','FF','FF','00','00','00','2013-05-01
09:56:37.000','2013-05-01 10:03:03.000',386,'0
','0',0,0,'00','00','00','00','00','00','00','00','00','00','00','00',,,0,'0','','0','0','','','00','0',0,0,0,0,,,,,'5630934411','7331295359688','00','00','00','3','1','1','0','0','0','','
','',' ','','',' ','01306800','Samsung','Galaxy','Handset','1','','','','','','0','','0','0'<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
3G<o:p></o:p></div>
<div class="MsoNormal">
5x944324461a1f18d093ba1595f,'DGCCSD_13050112_2938',1,1,1,1,1,1,1,1,0,'2013-05-01
12:55:08.000',0,'2013-05-01
12:48:04.000','11','00','413B','55','0AD8','2013-05-01
12:47:59.000','03','FF','<span style="mso-spacerun: yes;">
</span>','FFFF','','466974104424565','','','','','','','','FF','939644958<span style="mso-spacerun: yes;"> </span>','05','06','5430***660','05','06','65535',65535,,,'65535',,'FF',65535,,,'00','03','FFFF','67822059982110','123974700000830','','0000','0000','0000','0000','0000','0000','07','886983***686','05','05','8488',21111,466,97,'19546',886935***416,'05',21401,466,97,'FFFFFFFFFFFFFFFF',,'FF','FF','00000000','<span style="mso-spacerun: yes;"> </span>','3','8','2013-05-01 12:50:49.000','2013-05-01
12:48:08.000','F5B9','8130***660','05','06','00','<span style="mso-spacerun: yes;"> </span>','FFFF','FF',0,'1233804097','05','04',0,0,0,0,0,0,'886935***374',0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,'<span style="mso-spacerun: yes;"> </span>','2013-05-01
12:47:59.000',,'',,,'00',0,0,0,0,0,0,0,0,0,'413B','55','0AD8','','FF','10','FF','00',0,0,'','<span style="mso-spacerun: yes;"> </span>','<span style="mso-spacerun: yes;">
</span>',,0,0,0,0,0,161,161.11,'01','0',0,'000000','08',,53,'1246',0,0,'FFFF','FFFFF','01','FF',436,62871450,'00','1','00','','<span style="mso-spacerun: yes;"> </span>','<span style="mso-spacerun: yes;">
</span>','FF','FF','FF','FF','FF','FF','FF',0,'2013-05-01
12:47:58.000',2,'00',,,,,,,,,'',,'<span style="mso-spacerun: yes;">
</span>','FFFF','FF','00','FF','FF','00','FF','2013-05-01
12:47:59.000','642***660','1243408079','823***686','124191389688','121***660','066208079','','','',''<o:p></o:p></div>
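<div class="MsoNormal">
To sort on a timestamp, each line first has to be turned into a sort key. The sketch below is an assumption-laden illustration: the post does not say which of the several timestamp fields is the sort key, so the hypothetical CdrTimestamps.firstTimestamp simply takes the first value in the yyyy-MM-dd HH:mm:ss.SSS form seen in the samples. Real code would more likely pick a fixed column per file type.</div>

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: pulls a sort key out of one CDR line. */
final class CdrTimestamps {

    // Matches values such as 2013-05-01 11:10:58.000, as seen in the sample lines.
    private static final Pattern TIMESTAMP =
            Pattern.compile("\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}\\.\\d{3}");

    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss.SSS");

    /**
     * Returns the first timestamp found in the line, or empty if there is none.
     * Assumption: the first timestamp is the sort key. A value that matches the
     * pattern but is not a valid date makes LocalDateTime.parse throw.
     */
    static Optional<LocalDateTime> firstTimestamp(String csvLine) {
        Matcher matcher = TIMESTAMP.matcher(csvLine);
        if (!matcher.find()) {
            return Optional.empty();
        }
        return Optional.of(LocalDateTime.parse(matcher.group(), FORMAT));
    }
}
```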
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
The idea was to read 3 files at a time (one of each type) and
insert their lines into a shared sorted buffer. Once the reading is done, write
the sorted buffer out to an output file, clear it and move on to the next batch
of 3 files. Another requirement for the writer was that the data must not be merge sorted into a single file; it has to be split across multiple output files. The file writer thread must therefore keep track of the number of lines written and roll over to the next output file.<br />
<br /></div>
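<div class="MsoNormal">
The roll-over requirement can be sketched as a small writer that counts lines. This is a minimal illustration rather than the project's writer: the class name RollingCsvWriter, the sorted_NNNNN.csv file naming and the maxLinesPerFile parameter are all assumptions, and the output directory is expected to exist already.</div>

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

/** Hypothetical writer that starts a new output file every maxLinesPerFile lines. */
final class RollingCsvWriter implements AutoCloseable {
    private final Path outputDir;
    private final long maxLinesPerFile;
    private BufferedWriter current;   // null until the first line is written
    private long linesInCurrent;
    private int fileIndex;

    RollingCsvWriter(Path outputDir, long maxLinesPerFile) {
        if (maxLinesPerFile <= 0) {
            throw new IllegalArgumentException("maxLinesPerFile must be positive");
        }
        this.outputDir = outputDir;
        this.maxLinesPerFile = maxLinesPerFile;
    }

    /** Writes one line, rolling over to the next file when the current one is full. */
    void writeLine(String line) throws IOException {
        if (current == null || linesInCurrent >= maxLinesPerFile) {
            roll();
        }
        current.write(line);
        current.newLine();
        linesInCurrent++;
    }

    /** Closes the current file (if any) and opens the next one in sequence. */
    private void roll() throws IOException {
        close();
        fileIndex++;
        // File naming is an assumption made for this sketch.
        Path next = outputDir.resolve(String.format(Locale.ROOT, "sorted_%05d.csv", fileIndex));
        current = Files.newBufferedWriter(next, StandardCharsets.UTF_8);
        linesInCurrent = 0;
    }

    /** Number of output files opened so far. */
    int filesOpened() {
        return fileIndex;
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }
}
```

<div class="MsoNormal">
Because only the single writer thread touches this object, it needs no synchronization of its own.</div>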
<div class="MsoNormal">
The problem with this approach is that the files in a batch will
not all end at the same timestamp.</div>
<div class="MsoNormal">
For example, 2G might run up to timestamp T5, 3G up to timestamp T7 and
GPRS up to timestamp T4.</div>
<div class="MsoNormal">
In this case we can only flush data up to timestamp T4 and
must retain everything after that in the buffer, since the next batch of files
might contain records with those timestamps that still have to be sorted in.
This puts additional pressure on memory.</div>
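<div class="MsoNormal">
The T4 example amounts to flushing up to the smallest "last timestamp" among the three sources. The sketch below illustrates that rule under stated assumptions: timestamps are plain long values, the names WatermarkFlush, watermark and drainUpTo are invented for this example, and drainUpTo is called only while no reader is inserting.</div>

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical helper for the partial-flush rule described above. */
final class WatermarkFlush {

    /** The safe flush point is the smallest "last timestamp seen" across all sources. */
    static long watermark(Collection<Long> lastTimestampPerSource) {
        return Collections.min(lastTimestampPerSource);
    }

    /**
     * Removes and returns, in timestamp order, every line at or before the watermark.
     * Later entries stay in the buffer for the next batch.
     * Assumption: no reader thread is inserting while this runs.
     */
    static List<String> drainUpTo(ConcurrentSkipListMap<Long, Queue<String>> buffer, long watermark) {
        List<String> flushed = new ArrayList<>();
        Map.Entry<Long, Queue<String>> head;
        while ((head = buffer.firstEntry()) != null && head.getKey() <= watermark) {
            buffer.remove(head.getKey());
            flushed.addAll(head.getValue());
        }
        return flushed;
    }
}
```

<div class="MsoNormal">
With last timestamps of 5, 7 and 4 for 2G, 3G and GPRS, the watermark is 4, so entries 1 to 4 are written and 5 to 7 remain buffered. If a later file of the slowest type can still contain records stamped exactly T4, the comparison should be strict instead.</div>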
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
The approach I used was to do a modified concurrent external
merge sort. <o:p></o:p></div>
<ol start="1" type="1">
<li class="MsoNormal"><span style="color: #cccccc;">Based on
the producer-consumer pattern</span></li>
<li class="MsoNormal"><span style="color: #cccccc;">3 threads read from
the 3 files in parallel (2G, 3G and GPRS) and insert into a shared task buffer</span></li>
<li class="MsoNormal"><span style="color: #cccccc;">1 thread
consumes from the buffer</span></li>
<li class="MsoNormal"><span style="color: #cccccc;">Reader threads
read line by line and insert into a data structure that keeps its entries sorted
on timestamp, via a custom comparator, on every insert. It is basically a ConcurrentSkipListMap
backed by several ConcurrentLinkedQueue instances. The key of the map is the
timestamp, and the value is the queue of lines associated with that
timestamp.</span></li>
<li class="MsoNormal"><span style="color: #cccccc;">After each
reader thread finishes inserting into the task buffer, it waits on a
CyclicBarrier. The last thread to reach the barrier notifies the consumer
that the files have been read and sorted into the buffer</span></li>
<li class="MsoNormal"><span style="color: #cccccc;">The consumer is
awakened and simply writes the sorted map out to a CSV file</span></li>
<li class="MsoNormal"><span style="color: #cccccc;">Once the file is
written, the CyclicBarrier is reset and the cycle is repeated for the next batch of files</span></li>
</ol>
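<div class="MsoNormal">
The data structure in the list above can be sketched roughly as follows. This is a simplified illustration and not the project's class: the name SortedLineBuffer is invented, the key is assumed to be a long timestamp (for example epoch milliseconds), and natural ordering of Long stands in for the custom comparator.</div>

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.Consumer;

/** Hypothetical sorted buffer: timestamp -> queue of lines carrying that timestamp. */
final class SortedLineBuffer {

    // The skip list keeps keys sorted; each value collects the lines for one timestamp.
    private final ConcurrentSkipListMap<Long, Queue<String>> linesByTimestamp =
            new ConcurrentSkipListMap<>();

    /** Safe to call from several reader threads at once. */
    void add(long timestamp, String line) {
        linesByTimestamp
                .computeIfAbsent(timestamp, k -> new ConcurrentLinkedQueue<>())
                .add(line);
    }

    /**
     * Hands every buffered line to the sink in ascending timestamp order and
     * leaves the buffer empty. Meant to be called by the single consumer once
     * the readers have finished the current batch.
     */
    void drainTo(Consumer<String> sink) {
        Map.Entry<Long, Queue<String>> entry;
        while ((entry = linesByTimestamp.pollFirstEntry()) != null) {
            entry.getValue().forEach(sink);
        }
    }

    /** Number of distinct timestamps currently buffered. */
    int distinctTimestamps() {
        return linesByTimestamp.size();
    }
}
```

<div class="MsoNormal">
Lines that share a timestamp come out in the order they were queued, so the sort is on timestamp only.</div>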
<div class="MsoNormal" style="mso-margin-bottom-alt: auto; mso-margin-top-alt: auto;">
<br /></div>
<br />
Here is a snippet of the sorted map
<br />
<script src="https://gist.github.com/sourabhghose/e5fc01b555cf343d675c.js"></script>
<br />
A snippet of the reader (producer) thread looks like this:
<br />
<script src="https://gist.github.com/sourabhghose/0d2728c3895cc80edaad.js"></script>
<br />
And a snippet of the writer (consumer) thread looks like this:
<br/>
<script src="https://gist.github.com/sourabhghose/8479adda22bb72ec58a0.js"></script>
<br/>
<div class="MsoNormal">
A sample flow of the threads, coordinated with a CyclicBarrier, is as follows:</div>
<div class="MsoNormal">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtaZMLTm50MmVkqJjZQKSGpk7__YiBvpIZ4qPspzfjcvyfNzAGMLhqQe1QnAwObbHCZstaVhONoK-2NamWYqHO7FTU2ycKvluJGObrvX584u_M_OfvEGhh1q9-AimFWDY55HO0VmVjDpg/s1600/Screen+Shot+2013-09-26+at+1.29.50+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhtaZMLTm50MmVkqJjZQKSGpk7__YiBvpIZ4qPspzfjcvyfNzAGMLhqQe1QnAwObbHCZstaVhONoK-2NamWYqHO7FTU2ycKvluJGObrvX584u_M_OfvEGhh1q9-AimFWDY55HO0VmVjDpg/s1600/Screen+Shot+2013-09-26+at+1.29.50+PM.png" /></a></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
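The coordination in the diagram can be sketched roughly as follows. This is a minimal sketch and not the code from the repository: the class name BarrierMergeSketch, the merge method and the in-memory batches that stand in for files are all assumptions made for illustration. Each reader thread puts its batch of "timestamp,payload" lines into a shared sorted map and then waits at a CyclicBarrier; the barrier action plays the writer's role and drains the map in timestamp order before the next round starts. The overall output is only sorted if every record of one round precedes every record of the next, which the sketch simply assumes.
<br />

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicLong;

public class BarrierMergeSketch {

    /**
     * batchesPerReader.get(i).get(r) is the batch that reader i contributes in round r.
     * Lines are assumed to look like "timestamp,payload" with a non-negative timestamp.
     */
    static List<String> merge(List<List<List<String>>> batchesPerReader, int rounds) {
        ConcurrentSkipListMap<String, String> sorted = new ConcurrentSkipListMap<>();
        List<String> output = new ArrayList<>();
        AtomicLong sequence = new AtomicLong();

        // The barrier action runs once per round, in the last thread to arrive,
        // while every reader is still parked at the barrier.
        CyclicBarrier barrier = new CyclicBarrier(batchesPerReader.size(), () -> {
            output.addAll(sorted.values()); // values() iterates in ascending key order
            sorted.clear();
        });

        List<Thread> readers = new ArrayList<>();
        for (List<List<String>> batches : batchesPerReader) {
            Thread reader = new Thread(() -> {
                try {
                    for (int round = 0; round < rounds; round++) {
                        if (round < batches.size()) {
                            for (String line : batches.get(round)) {
                                long ts = Long.parseLong(line.substring(0, line.indexOf(',')));
                                // Zero-padded key sorts by timestamp; the sequence number keeps duplicates apart.
                                sorted.put(String.format("%019d-%019d", ts, sequence.getAndIncrement()), line);
                            }
                        }
                        barrier.await(); // wait until all readers have delivered this round
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            readers.add(reader);
            reader.start();
        }
        for (Thread reader : readers) {
            try {
                reader.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for readers", e);
            }
        }
        return output;
    }
}
```

<br />
A production version would also need to break the barrier when a reader fails, otherwise the remaining readers would wait forever.
<br />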
<div class="MsoNormal">
On a 4-core machine with 35 GB of memory and 120 GB of data
files, this took about 2 hours to complete. This solution worked for my use
case. Ideally, however, you would read only a predefined number of lines from each file into
the buffer, rather than the entire file, to avoid heap pressure.</div>
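Reading a bounded number of lines at a time could look something like the sketch below. The class name ChunkedReaderSketch, the readChunk helper and the chunk size of 2 are assumptions for illustration only; a StringReader stands in for a large CDR file.
<br />

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class ChunkedReaderSketch {

    /** Reads at most maxLines lines from the reader; an empty list means end of input. */
    static List<String> readChunk(BufferedReader reader, int maxLines) {
        List<String> chunk = new ArrayList<>(maxLines);
        try {
            String line;
            // Check the size first so that no line is consumed once the chunk is full.
            while (chunk.size() < maxLines && (line = reader.readLine()) != null) {
                chunk.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return chunk;
    }

    public static void main(String[] args) {
        // A StringReader stands in for a large CDR file.
        BufferedReader reader = new BufferedReader(new StringReader("1,a\n2,b\n3,c\n4,d\n5,e"));
        List<String> chunk;
        while (!(chunk = readChunk(reader, 2)).isEmpty()) {
            System.out.println("processing " + chunk.size() + " lines: " + chunk);
        }
    }
}
```

<br />
With this shape, the heap only ever holds one chunk per open file, so memory use is bounded by the chunk size times the number of reader threads rather than by the file sizes.
<br />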
<div class="MsoNormal">
On a side note, I tested this with both the CMS and G1 garbage collectors and
found CMS to be much more performant and predictable for my use case.<br />
The entire code base is available here: <a href="https://github.com/sourabhghose/LargeFileMergeSort">https://github.com/sourabhghose/LargeFileMergeSort</a></div>
<div class="MsoNormal">
<br /></div>
</div>Sourabh Ghosehttp://www.blogger.com/profile/12406723074994150793noreply@blogger.com0tag:blogger.com,1999:blog-633311527952150364.post-19179659768624468172012-12-27T03:18:00.002-08:002013-01-03T03:46:50.387-08:00Clustered Spring SessionRegistry: Spring Security Concurrent sessions in a clustered environment
<div dir="ltr" style="text-align: left;" trbidi="on">
<span class="Apple-style-span" style="font-family: inherit;"><br /></span>
<span class="Apple-style-span" style="font-family: inherit;"><br /></span>
<span class="Apple-style-span" style="font-family: inherit;">I was recently working with a client whose application has a login module built on Spring Security, which was to be clustered using web sessions. The requirement was that if a user logs in to the system a second time, the previous session should be invalidated and redirected to an error page.</span><br />
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;">Spring Security provides this functionality out of the box using ConcurrentSessionFilter. This filter makes use of an internal session registry that keeps track of which users have logged in and their session details. </span></span><br />
<div>
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="color: white; font-family: inherit;"><a href="http://static.springsource.org/spring-security/site/docs/3.0.x/reference/session-mgmt.html" style="text-decoration: none;" title="Follow link">http://static.springsource.org/spring-security/site/docs/3.0.x/reference/session-mgmt.html</a></span></span></div>
<div>
<span class="Apple-style-span" style="line-height: 17px;"><br /></span></div>
<div>
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;">However, while integrating with Terracotta web sessions, we found that this did not work. After investigation, we found that the default session registry implementation was not clustered: it kept a local copy of the session details on each server, so a user could log in multiple times through different servers.<br /><br />To make the SessionRegistry work in a distributed environment, or behind a load balancer, we created a custom session registry on top of Ehcache. All the session details are now stored in Ehcache and replicated across the servers. Hence a login session created on one server is visible to the other servers, and we can invalidate the previous session.</span></span></div>
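The idea can be illustrated with a small, dependency-free sketch. It is not the attached project's code: the class SharedSessionRegistrySketch and its login helper are hypothetical, a ConcurrentHashMap stands in for the replicated Ehcache cache, and the method names only loosely mirror Spring Security's SessionRegistry (registerNewSession, removeSessionInformation, getAllSessions). The point is that two registry instances backed by the same store see each other's sessions.
<br />

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

public class SharedSessionRegistrySketch {

    // Session id -> user name. In the real setup this would be a replicated cache.
    private final Map<String, String> userBySession;

    SharedSessionRegistrySketch(Map<String, String> sharedStore) {
        this.userBySession = sharedStore;
    }

    void registerNewSession(String sessionId, String user) {
        userBySession.put(sessionId, user);
    }

    void removeSessionInformation(String sessionId) {
        userBySession.remove(sessionId);
    }

    /** Returns the ids of all sessions known for the user, on any node, sorted for stable output. */
    List<String> getAllSessions(String user) {
        return userBySession.entrySet().stream()
                .filter(e -> e.getValue().equals(user))
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    /** Logs the user in on this node and returns the older sessions that were evicted. */
    List<String> login(String sessionId, String user) {
        List<String> previous = getAllSessions(user);
        previous.forEach(this::removeSessionInformation);
        registerNewSession(sessionId, user);
        return previous;
    }

    public static void main(String[] args) {
        Map<String, String> shared = new ConcurrentHashMap<>(); // stand-in for the replicated cache
        SharedSessionRegistrySketch nodeA = new SharedSessionRegistrySketch(shared);
        SharedSessionRegistrySketch nodeB = new SharedSessionRegistrySketch(shared);
        nodeA.login("session-1", "alice");
        System.out.println(nodeB.login("session-2", "alice")); // node B sees and evicts node A's session
        System.out.println(nodeA.getAllSessions("alice"));      // only the newer session remains
    }
}
```

<br />
Note that login here is not atomic across nodes; a real clustered registry would need to rely on the cache's own locking or consistency guarantees.
<br />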
<div>
<span class="Apple-style-span" style="line-height: 17px;"><br /></span></div>
<div>
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;">The attached project was tested on Terracotta 3.7.2, JBoss 7, and Spring 3.0.5.</span></span><br />
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;"><br /></span></span>
<span class="Apple-style-span" style="line-height: 17px;">Download the code <a href="https://docs.google.com/open?id=0ByVw97rTI_gCc2NXa1pTY283OGc" target="_blank">here</a>.</span></div>
</div>
Sourabh Ghosehttp://www.blogger.com/profile/12406723074994150793noreply@blogger.com4tag:blogger.com,1999:blog-633311527952150364.post-35786912237723285202012-09-04T05:16:00.002-07:002012-12-28T02:30:49.602-08:00Plugging in Ehcache into iBATIS
<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
<br /></div>
As you probably know, Ehcache is Hibernate’s default second-level cache. Integrating Ehcache into iBATIS is relatively easy using iBATIS’s CacheController interface, which lets you plug in your own custom caching solution or any third-party one. The javadoc for CacheController is <a href="http://ibatis.apache.org/docs/java/dev/com/ibatis/sqlmap/engine/cache/CacheController.html" target="_blank">here</a>.
To plug in Ehcache, implement the CacheController interface as follows.<br />
<br />
<h3 style="text-align: left;">
Implementing the CacheController:
</h3>
<pre class="java" name="code">import java.io.File;
import java.net.URL;
import java.util.Properties;

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.ibatis.sqlmap.engine.cache.CacheController;
import com.ibatis.sqlmap.engine.cache.CacheModel;

public class EhcacheIbatisCacheController implements CacheController {

    private static final Logger logger = LoggerFactory.getLogger(EhcacheIbatisCacheController.class);

    /** Ehcache CacheManager. */
    private CacheManager cacheManager;

    /**
     * Flush a cache model.
     * @param cacheModel - the model to flush.
     */
    public void flush(CacheModel cacheModel) {
        getCache(cacheModel).removeAll();
    }

    /**
     * Get an object from a cache model.
     * @param cacheModel - the model.
     * @param key - the key to the object.
     * @return the object if in the cache, otherwise null.
     */
    public Object getObject(CacheModel cacheModel, Object key) {
        Object result = null;
        try {
            Element element = getCache(cacheModel).get(key.toString());
            if (element != null) {
                result = element.getObjectValue();
            }
        } catch (Exception e) {
            // Treat a failed lookup as a miss so that iBATIS falls back to the database.
            logger.debug("cache lookup failed, will check in db", e);
        }
        return result;
    }

    /**
     * Put an object into a cache model.
     * @param cacheModel - the model to add the object to.
     * @param key - the key to the object.
     * @param object - the value to add.
     */
    public void putObject(CacheModel cacheModel, Object key, Object object) {
        getCache(cacheModel).put(new Element(key.toString(), object));
    }

    /**
     * Remove an object from a cache model.
     * @param cacheModel - the model to remove the object from.
     * @param key - the key to the object.
     * @return the removed object.
     */
    public Object removeObject(CacheModel cacheModel, Object key) {
        Object result = this.getObject(cacheModel, key.toString());
        getCache(cacheModel).remove(key.toString());
        return result;
    }

    /**
     * Configure a cache controller. Initializes the Ehcache CacheManager from the
     * "configFile" property, which may be a file system path or a classpath resource.
     * @param props - the properties object containing configuration information.
     */
    public void setProperties(Properties props) {
        String configFile = props.getProperty("configFile");
        File file = new File(configFile);
        if (file.exists()) {
            cacheManager = CacheManager.create(file.getAbsolutePath());
        } else {
            // Not found on the file system, so look it up on the classpath.
            URL url = getClass().getResource(configFile);
            cacheManager = CacheManager.create(url);
        }
    }

    /**
     * Gets an Ehcache cache based on an iBATIS cache model.
     * The cache name is the id of the cache model.
     * @param cacheModel - the cache model.
     * @return the Cache.
     */
    private Cache getCache(CacheModel cacheModel) {
        String cacheName = cacheModel.getId();
        return cacheManager.getCache(cacheName);
    }
}
</pre>
Each method provides access to the CacheModel that controls the cache so you can access parameters in the CacheModel when required.<br />
<br />
<h3 style="text-align: left;">
Registering EhcacheIbatisController with iBATIS: </h3>
<br />
CacheModels and CacheControllers must be registered in the iBATIS XML.
First, create an alias for the Ehcache controller as follows:
<br />
<pre class="xml" name="code">&lt;typeAlias alias="EhcacheIbatisController" type="EhcacheIbatisCacheController"/&gt;
</pre>
Now we need to apply the Cache controller to a CacheModel definition in the xml.
<br />
<pre class="xml" name="code">&lt;cacheModel id="EmployeeCache" type="EhcacheIbatisController"&gt;
    &lt;flushInterval hours="1"/&gt;
    &lt;flushOnExecute statement="updateEmployee"/&gt;
    &lt;flushOnExecute statement="insertEmployee"/&gt;
    &lt;flushOnExecute statement="deleteEmployee"/&gt;
    &lt;!-- Read by setProperties(); the path shown here is an assumed example --&gt;
    &lt;property name="configFile" value="/ehcache.xml"/&gt;
&lt;/cacheModel&gt;
</pre>
</div>
Sourabh Ghosehttp://www.blogger.com/profile/12406723074994150793noreply@blogger.com1tag:blogger.com,1999:blog-633311527952150364.post-15939966676242490102011-10-16T02:54:00.000-07:002012-12-28T02:32:22.137-08:00EHCache - Write behind example
<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
<div dir="ltr" style="text-align: left;" trbidi="on">
<h2>
What is Write-Behind?</h2>
<div>
Write behind is asynchronous writing of data to the underlying database. Thus, when data is being written to the Cache, instead of writing simultaneously to the database, the cache saves the data into a queue and allows a background thread to write to the database later. </div>
<div>
<br /></div>
<div>
This is a transformative capability because now you can:</div>
<div>
<ol style="text-align: left;">
<li>Defer writes to the database until a particular time</li>
<li>Use write coalescing, which means if there are multiple updates on the same key in the queue, only the latest one is considered</li>
<li>Batch multiple write operations</li>
<li>Specify the number of retry attempts in case of write failure</li>
</ol>
<div>
Here is an introductory <a href="http://vimeo.com/21193026" target="_blank">video</a>.</div>
</div>
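<div>
To make the queueing, coalescing and batching ideas concrete, here is a minimal, dependency-free sketch. It is not how Ehcache implements write-behind: the class WriteBehindSketch and its methods are hypothetical, a list of strings stands in for the database, and flush() is called explicitly instead of by a background thread so that the behaviour is easy to follow.</div>

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WriteBehindSketch {

    private final Map<String, String> cache = new HashMap<>();
    // Pending writes keyed by cache key: a later update to the same key
    // replaces the queued one (write coalescing).
    private final Map<String, String> pending = new LinkedHashMap<>();
    // Stand-in for the database: a log of the writes that reached it.
    private final List<String> database = new ArrayList<>();

    /** Updates the cache immediately and queues the database write for later. */
    synchronized void put(String key, String value) {
        cache.put(key, value);
        pending.put(key, value);
    }

    /** Reads always come from the cache, whether or not the write has been flushed. */
    synchronized String get(String key) {
        return cache.get(key);
    }

    /** Writes all queued entries as one batch and returns how many were written. */
    synchronized int flush() {
        int written = pending.size();
        pending.forEach((key, value) -> database.add(key + "=" + value));
        pending.clear();
        return written;
    }

    synchronized List<String> databaseLog() {
        return new ArrayList<>(database);
    }

    public static void main(String[] args) {
        WriteBehindSketch store = new WriteBehindSketch();
        store.put("a", "1");
        store.put("a", "2"); // coalesced with the previous update to "a"
        store.put("b", "3");
        System.out.println("flushed " + store.flush() + " writes: " + store.databaseLog());
    }
}
```

<div>
In the sketch, three puts result in only two database writes, because the two updates to the same key are coalesced before the batch is flushed.</div>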
<div>
<br /></div>
<div>
In order to use write-behind, you first need to implement the CacheWriter interface:</div>
</div>
<pre class="java" name="code">import java.util.Collection;

import net.sf.ehcache.CacheEntry;
import net.sf.ehcache.CacheException;
import net.sf.ehcache.Ehcache;
import net.sf.ehcache.Element;
import net.sf.ehcache.writer.CacheWriter;
import net.sf.ehcache.writer.writebehind.operations.SingleOperationType;

/*
 * This class handles writing to the database or your backend persistence storage.
 */
public class EhcacheWriteBehindClass implements CacheWriter {

    @Override
    public CacheWriter clone(Ehcache cache) throws CloneNotSupportedException {
        throw new CloneNotSupportedException("EhcacheWriteBehindClass cannot be cloned!");
    }

    @Override
    public void delete(CacheEntry entry) throws CacheException {
        // TODO Auto-generated method stub
    }

    @Override
    public void deleteAll(Collection&lt;CacheEntry&gt; entries) throws CacheException {
        // TODO Auto-generated method stub
    }

    @Override
    public void dispose() throws CacheException {
        // You can close database connections here
    }

    @Override
    public void init() {
        // You can initialize the database here
    }

    @Override
    public void write(Element element) throws CacheException {
        // Typically you would write to your database here
        System.out.println("Write : Key is " + element.getKey());
        System.out.println("Write : Value is " + element.getValue());
    }

    @Override
    public void writeAll(Collection&lt;Element&gt; elements) throws CacheException {
        // TODO Auto-generated method stub
        System.out.println("Write All");
    }

    @Override
    public void throwAway(Element element, SingleOperationType operationType,
            RuntimeException exception) {
        // TODO Auto-generated method stub
    }
}
</pre>
<div>
This class is instantiated by the CacheWriterFactory: </div>
<br />
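<pre class="java" name="code">import java.util.Properties;

import net.sf.ehcache.Ehcache;
import net.sf.ehcache.writer.CacheWriter;
import net.sf.ehcache.writer.CacheWriterFactory;

public class WriteBehindClassFactory extends CacheWriterFactory {

    @Override
    public CacheWriter createCacheWriter(Ehcache cache, Properties properties) {
        return new EhcacheWriteBehindClass();
    }
}
</pre>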
</div>
Now register the factory in the ehcache.xml as follows:
<br />
<pre class="xml" name="code">
&lt;cache name="writeBehindCache" eternal="true" maxElementsInMemory="10000" maxElementsOnDisk="1000000" statistics="true"&gt;
    &lt;cacheWriter writeMode="write-behind" maxWriteDelay="10" rateLimitPerSecond="5"
                 writeCoalescing="false" writeBatching="false" writeBatchSize="20"
                 retryAttempts="2" retryAttemptDelaySeconds="2" notifyListenersOnException="true"&gt;
        &lt;cacheWriterFactory class="WriteBehindClassFactory"/&gt;
    &lt;/cacheWriter&gt;
&lt;/cache&gt;
</pre>
<div>
A class that uses this write-behind functionality would look like this:
</div>
<pre class="java" name="code">
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class EhcacheWriteBehindTest {

    public static void main(String[] args) throws Exception {
        // pass in the number of objects you want to generate, default is 100
        int numberOfObjects = Integer.parseInt(args.length == 0 ? "100" : args[0]);
        System.out.println(numberOfObjects);
        // create the CacheManager
        CacheManager cacheManager = CacheManager.getInstance();
        // get a handle on the Cache - "writeBehindCache" is the name of a cache in the ehcache.xml file
        Cache myCache = cacheManager.getCache("writeBehindCache");
        // iterate through numberOfObjects and use the loop counter as the key; the value does not matter at this time
        for (int i = 0; i &lt; numberOfObjects; i++) {
            String key = Integer.toString(i);
            if (!checkInCache(key, myCache)) {
                // when putting in the cache, it is as an Element; the key and the value must be serializable
                // note, we use the putWithWriter method and not the put method
                myCache.putWithWriter(new Element(key, "Value"));
                System.out.println(key + " NOT in cache!!!");
            } else {
                System.out.println("Put with writer ... value1");
                myCache.putWithWriter(new Element(key, "Value1"));
            }
        }
        // keep the JVM alive so that the write-behind queue has time to be processed
        while (true) {
            Thread.sleep(1000);
        }
    }

    // check to see if the key is in the cache
    private static boolean checkInCache(String key, Cache myCache) throws Exception {
        Element element = myCache.get(key);
        boolean returnValue = false;
        if (element != null) {
            System.out.println(key + " is in the cache!!!");
            returnValue = true;
        }
        return returnValue;
    }
}
</pre>
<div>
That's it! For a detailed explanation of the configuration options involved, have a look at <a href="http://ehcache.org/documentation/apis/write-through-caching#configuration" target="_blank">this</a>.<br />
<br />
The limitation of this approach is that if your JVM goes down, your write-behind queue is lost. To avoid this you can use clustered Terracotta, which uses the <a href="http://terracotta.org/documentation/terracotta-server-array/introduction" target="_blank">Terracotta Server Array</a>. In this case the queue is maintained at the Terracotta Server Array, which provides HA features. <span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;">If one client JVM goes down, any changes it put into the write-behind queue can still be loaded by threads in the other clustered JVMs, and will therefore be applied to the database without any data loss. </span></span><br />
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;"><br /></span></span><br />
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;">Terracotta Server Array is an enterprise feature and can be configured extremely easily. You can download a trial version from <a href="http://terracotta.org/downloads" target="_blank">here</a>. </span></span><br />
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;"><br /></span></span><br />
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;">The only change you need to make in this app to make it clustered is in the ehcache.xml. Your ehcache.xml would now look like this:</span></span><br />
<span class="Apple-style-span" style="line-height: 17px;"><span class="Apple-style-span" style="font-family: inherit;"><br /></span></span><br />
<br />
<div>
</div>
</div>
<pre class="xml" name="code">&lt;cache name="writeBehindCache" eternal="true" maxElementsInMemory="10000" maxElementsOnDisk="1000000" statistics="true"&gt;
  &lt;cacheWriter writeMode="write-behind" maxWriteDelay="10" rateLimitPerSecond="5" writeCoalescing="false" writeBatching="false" writeBatchSize="20" retryAttempts="2" retryAttemptDelaySeconds="2" notifyListenersOnException="true"&gt;
    &lt;cacheWriterFactory class="WriteBehindClassFactory"/&gt;
  &lt;/cacheWriter&gt;
  &lt;terracotta storageStrategy="DCV2"/&gt;
&lt;/cache&gt;
&lt;terracottaConfig url="localhost:9510"/&gt;</pre>
<div>
The url attribute of terracottaConfig (localhost:9510 here) is the address where your Terracotta Server Array runs. </div>
</div>Sourabh Ghosehttp://www.blogger.com/profile/12406723074994150793noreply@blogger.com11tag:blogger.com,1999:blog-633311527952150364.post-46488440420580090972011-08-17T11:22:00.000-07:002012-12-28T02:31:17.655-08:00How to keep the Database in sync with your cache?
<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
<div class="MsoNormal">
There are a few different ways to achieve this. You can put
the onus on the cache to fetch the data periodically or when it
determines that the data is stale. Alternatively, you can put the onus on the
underlying database to “push” updates periodically or whenever the data is updated.<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<h2>
Read Heavy use cases<o:p></o:p></h2>
<h3>
Cache -> DB<o:p></o:p></h3>
<div class="MsoNormal">
<br /></div>
<div class="MsoListParagraph" style="mso-list: l1 level1 lfo2; text-indent: -.25in;">
1.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>The most straightforward way is to set the <b style="mso-bidi-font-weight: normal;">Time to Live (TTL)</b> and <b style="mso-bidi-font-weight: normal;">Time to Idle (TTI)</b> on the cache so the
data will expire periodically. The next request will result in a cache miss and
your application will pull the current value from the underlying database and
put it into the cache.<o:p></o:p></div>
<div class="MsoNormal" style="text-indent: .5in;">
A few things to note here:<o:p></o:p></div>
<div class="MsoListParagraphCxSpFirst" style="margin-left: .75in; mso-add-space: auto; mso-list: l0 level1 lfo1; text-indent: -.25in;">
a.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>There may be a window during which the data in the cache
is out of sync with the underlying database.<o:p></o:p></div>
<div class="MsoListParagraphCxSpLast" style="margin-left: .75in; mso-add-space: auto; mso-list: l0 level1 lfo1; text-indent: -.25in;">
b.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>Each cache miss is a performance hit, because the request
has to go to the database.<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-indent: .5in;">
This is called <b style="mso-bidi-font-weight: normal;">read-through caching</b>.<o:p></o:p></div>
<div class="MsoNormal" style="text-indent: .5in;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiy9VmvHTocOTy081eQO0eWMq9mN1fa1PHLWW7tZehjW_VjGIw-0F1TqzhkFzFfGPBc4BKizh-DiE9wOnGhHwdDj_z5gBYgP1Bc0mAd40F3PuewidWkL5r0K3on3dJZUmP6BXrnVrJnHMo/s1600/Screen+Shot+2012-07-21+at+11.45.38+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEiy9VmvHTocOTy081eQO0eWMq9mN1fa1PHLWW7tZehjW_VjGIw-0F1TqzhkFzFfGPBc4BKizh-DiE9wOnGhHwdDj_z5gBYgP1Bc0mAd40F3PuewidWkL5r0K3on3dJZUmP6BXrnVrJnHMo/s320/Screen+Shot+2012-07-21+at+11.45.38+PM.png" width="258" /></a></div>
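<div class="MsoNormal">
The expire-then-reload cycle described above can be sketched in plain Java. This is only an illustration under stated assumptions, not how Ehcache implements it: the class name TtlCacheSketch, the injected clock and the loader function (which stands in for the database query) are all made up for the example, and only TTL is modeled, not TTI.</div>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch: entries expire ttlMillis after they were loaded. */
final class TtlCacheSketch<K, V> {

    private static final class Entry<V> {
        final V value;
        final long loadedAt;

        Entry(V value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;     // injected so expiry can be exercised without sleeping
    private final Function<K, V> loader;  // assumption: stands in for the database query
    private int misses = 0;

    TtlCacheSketch(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    /** Returns the cached value, reloading it from the loader if it is missing or expired. */
    synchronized V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAt >= ttlMillis) {
            misses++;                                   // the cache miss that costs a database round trip
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }

    synchronized int getMisses() {
        return misses;
    }
}
```

<div class="MsoNormal">
Until the TTL elapses, get keeps returning the value that was loaded earlier even if the database has changed, which is exactly the out-of-sync window mentioned in point a.</div>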
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
</div>
<div class="MsoListParagraph" style="mso-list: l0 level1 lfo1; text-indent: -.25in;">
2.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>An alternative approach is to update or invalidate the cache
periodically: a batch process (which could be scheduled
using the open source Quartz scheduler) runs at regular intervals and either
invalidates or updates the cache. You could do this by using a <a href="http://ehcache.org/apidocs/net/sf/ehcache/constructs/blocking/SelfPopulatingCache.html">SelfPopulatingCache</a>
in Ehcache.<o:p></o:p></div>
<div class="MsoListParagraph" style="mso-list: l0 level1 lfo1; text-indent: -.25in;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlDs-6vOwD15c2-gFWwfgTD5fXFROcOH6qNBt_t1zU66A6R76LP3sNwPgJncXTws1f340wRIqX0Dry74CEm9oMkh2GUwkh83GivFKSAyq7LO7s094YRxzyTC3_OG-02nTHkBuZOXC9WA4/s1600/Screen+Shot+2012-07-21+at+11.46.17+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhlDs-6vOwD15c2-gFWwfgTD5fXFROcOH6qNBt_t1zU66A6R76LP3sNwPgJncXTws1f340wRIqX0Dry74CEm9oMkh2GUwkh83GivFKSAyq7LO7s094YRxzyTC3_OG-02nTHkBuZOXC9WA4/s320/Screen+Shot+2012-07-21+at+11.46.17+PM.png" width="318" /></a></div>
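<div class="MsoNormal">
A periodic refresh job of this kind might look roughly like the sketch below. It uses the JDK's ScheduledExecutorService in place of Quartz and a plain map in place of Ehcache, purely to keep the example self-contained; PeriodicRefreshSketch and the snapshotLoader supplier (standing in for a database query that returns the current rows) are hypothetical names.</div>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch: a background job that re-syncs the cache at a fixed interval. */
final class PeriodicRefreshSketch<K, V> implements AutoCloseable {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Supplier<Map<K, V>> snapshotLoader; // assumption: returns the current database rows
    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    PeriodicRefreshSketch(Supplier<Map<K, V>> snapshotLoader) {
        this.snapshotLoader = snapshotLoader;
    }

    /** One refresh pass: drop keys that no longer exist, then overwrite the rest. */
    void refreshNow() {
        Map<K, V> snapshot = snapshotLoader.get();
        cache.keySet().retainAll(snapshot.keySet());
        cache.putAll(snapshot);
    }

    /** Runs refreshNow immediately and then once per period. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleAtFixedRate(() -> {
            try {
                refreshNow();
            } catch (RuntimeException e) {
                // swallow so the schedule keeps running; a real job would log this
            }
        }, 0, period, unit);
    }

    V get(K key) {
        return cache.get(key);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

<div class="MsoNormal">
Reads never touch the database here, so there are no cache misses on the request path; the price is that data can be stale for up to one refresh period.</div>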
<div class="separator" style="clear: both; text-align: center;">
<br /></div>
<div class="separator" style="clear: both; text-align: left;">
</div>
<h3>
DB -> Cache<o:p></o:p></h3>
<div class="MsoNormal">
<br /></div>
<div class="MsoListParagraph" style="mso-layout-grid-align: none; mso-list: l1 level1 lfo1; mso-pagination: none; text-autospace: none; text-indent: -.25in;">
1.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>You could also transfer the responsibility for syncing to the
underlying database itself. For example, Oracle AQ provides a way to register a
callback that is invoked when database updates happen. This can be leveraged to either
invalidate or update the cache store.<o:p></o:p></div>
<div class="MsoNormal" style="mso-layout-grid-align: none; mso-pagination: none; text-autospace: none;">
<br /></div>
<div class="MsoListParagraph" style="mso-layout-grid-align: none; mso-list: l1 level1 lfo1; mso-pagination: none; text-autospace: none; text-indent: -.25in;">
2.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>Alternatively, you could use middleware
technologies such as GoldenGate or JMS to capture DB changes as they occur and
"push" notifications into the Memory Store.<o:p></o:p></div>
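<div class="MsoNormal">
Whichever technology delivers the notification, the cache side usually boils down to a small callback. The sketch below assumes a hypothetical RowChanged event and an onRowChanged method that the messaging layer would call; none of these names come from Oracle AQ, GoldenGate or JMS, and a plain map stands in for the cache.</div>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical sketch: the database side pushes change events and the cache reacts to them. */
final class PushInvalidationSketch {

    /** Assumed event shape; a real one would be built from the incoming message. */
    static final class RowChanged {
        final String key;
        final String newValue; // null means "invalidate only, reload on the next read"

        RowChanged(String key, String newValue) {
            this.key = key;
            this.newValue = newValue;
        }
    }

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> loader; // assumption: stands in for the database query

    PushInvalidationSketch(Function<String, String> loader) {
        this.loader = loader;
    }

    /** Normal read path: load on a miss, otherwise serve from the cache. */
    String get(String key) {
        return cache.computeIfAbsent(key, loader);
    }

    /** Called by the messaging layer for each change notification. */
    void onRowChanged(RowChanged event) {
        if (event.newValue == null) {
            cache.remove(event.key);              // invalidate
        } else {
            cache.put(event.key, event.newValue); // update in place
        }
    }
}
```

<div class="MsoNormal">
Invalidating keeps the messages small but costs one cache miss per change; pushing the new value avoids the miss at the price of larger messages.</div>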
<div class="MsoNormal">
<br /></div>
<h2>
Write Heavy use cases<o:p></o:p></h2>
<div class="MsoNormal">
<br /></div>
<div class="MsoListParagraph" style="mso-list: l0 level1 lfo2; text-indent: -.25in;">
1.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>Some scenarios require frequent
updates to stored data. With the Write-through feature provided by Ehcache,
every update to the cached data triggers a synchronous update to the database.
However, database updates are almost always slower than cache updates, so this
lowers the effective update rate of the cache and hurts overall performance.
When many write requests arrive at the same time, the
database can easily become a bottleneck or, even worse, be brought down by a
burst of heavy writes. The <a href="http://terracotta.org/documentation/enterprise-ehcache/api-guide#26758">Write-behind</a>
feature provided by Ehcache allows quick cache writes while keeping the database
eventually consistent with the cache. Instead of writing to the database
at the same time as the cache, the
write-behind cache saves the changed data into a queue and lets a background
thread do the database write later. The cache write therefore proceeds
without waiting for the database write and finishes much faster. Every
change is eventually persisted to the database, and in the
meantime any read from the cache still gets the latest data.<o:p></o:p></div>
<div class="MsoListParagraph" style="mso-list: l0 level1 lfo2; text-indent: -.25in;">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifl6DpqO9wMC0i1JloZ4pIBRY0VcLbxNnK1DvcbucL1RVRg_G7SSySc13O9VymU1u8Omb2fyBVqc9NaN3rBcBETkRXj-dIHnZ_AN1uTIFcCkOvixSKWEk4QkTl5AwzFtpcOqbhjTLJIyc/s1600/Screen+Shot+2012-07-21+at+11.46.32+PM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEifl6DpqO9wMC0i1JloZ4pIBRY0VcLbxNnK1DvcbucL1RVRg_G7SSySc13O9VymU1u8Omb2fyBVqc9NaN3rBcBETkRXj-dIHnZ_AN1uTIFcCkOvixSKWEk4QkTl5AwzFtpcOqbhjTLJIyc/s320/Screen+Shot+2012-07-21+at+11.46.32+PM.png" width="246" /></a></div>
<div class="separator" style="clear: both; text-align: left;">
In the case of Terracotta, the Terracotta
Server Array maintains the write-behind queue. A thread on each JVM checks the
shared queue and saves each data change left in the queue.</div>
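<div class="MsoNormal">
To make the queue-plus-background-thread idea concrete, here is a minimal single-JVM sketch in plain Java. It is not Ehcache's implementation and has no batching, coalescing, rate limiting or retries; WriteBehindSketch and the databaseWriter callback (standing in for the real database write) are assumed names.</div>

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.BiConsumer;

/** Hypothetical sketch: cache writes return immediately, database writes happen later. */
final class WriteBehindSketch implements AutoCloseable {

    // sentinel that tells the writer thread to stop; compared by identity
    private static final Map.Entry<String, String> POISON = new SimpleImmutableEntry<>("", "");

    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final BlockingQueue<Map.Entry<String, String>> queue = new LinkedBlockingQueue<>();
    private final Thread writer;

    WriteBehindSketch(BiConsumer<String, String> databaseWriter) {
        writer = new Thread(() -> {
            try {
                Map.Entry<String, String> change;
                while ((change = queue.take()) != POISON) {
                    // assumption: this callback performs the slow database write
                    databaseWriter.accept(change.getKey(), change.getValue());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "write-behind");
        writer.setDaemon(true);
        writer.start();
    }

    /** Updates the cache and queues the change; does not wait for the database. */
    void put(String key, String value) {
        cache.put(key, value);
        queue.add(new SimpleImmutableEntry<>(key, value));
    }

    /** Reads always see the latest cached value, even if it is not persisted yet. */
    String get(String key) {
        return cache.get(key);
    }

    /** Waits until every queued change has been handed to the database writer. */
    @Override
    public void close() {
        queue.add(POISON);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

<div class="MsoNormal">
Because the queue lives in the JVM's heap, anything still queued is lost if the process dies, which is the gap a shared queue on the Terracotta Server Array is meant to close.</div>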
<div class="separator" style="clear: both; text-align: left;">
</div>
<div class="MsoNormal" style="margin-left: .5in;">
<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoListParagraphCxSpFirst" style="mso-list: l0 level1 lfo2; text-indent: -.25in;">
2.<span style="font: normal normal normal 7pt/normal 'Times New Roman';"> </span>Finally, you could also make your application
update the cache and DB together. It is advisable to use a transaction to
do this, in the following manner:<o:p></o:p></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-left: 49.5pt; mso-add-space: auto; mso-list: l1 level1 lfo1; text-indent: 4.5pt;">
a.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>Start a transaction<o:p></o:p></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-left: 49.5pt; mso-add-space: auto; mso-list: l1 level1 lfo1; text-indent: 4.5pt;">
b.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>Update the database<o:p></o:p></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-left: 49.5pt; mso-add-space: auto; mso-list: l1 level1 lfo1; text-indent: 4.5pt;">
c.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>Update the cache<o:p></o:p></div>
<div class="MsoListParagraphCxSpLast" style="margin-left: 49.5pt; mso-add-space: auto; mso-list: l1 level1 lfo1; text-indent: 4.5pt;">
d.<span style="font: normal normal normal 7pt/normal 'Times New Roman';">
</span>Commit the transaction<o:p></o:p></div>
<div class="MsoNormal" style="margin-left: 0.5in; text-align: left;">
Keep in mind that your
update code becomes directly aware of the cache, and that there is a performance impact
since your update latency now includes both the DB and the cache update time.<o:p></o:p></div>
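<div class="MsoNormal">
Steps a to d can be sketched as follows. The Database and Transaction interfaces are hypothetical stand-ins (with JDBC they would wrap a Connection with auto-commit switched off), and the cache is a plain map that is not transactional itself, so the sketch restores the previous cache value by hand when the transaction fails.</div>

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch: update the database and the cache inside one transaction. */
final class TransactionalUpdateSketch {

    /** Assumed transaction handle, not a real library API. */
    interface Transaction {
        void update(String key, String value);
        void commit();
        void rollback();
    }

    /** Assumed entry point for starting a transaction. */
    interface Database {
        Transaction begin();
    }

    private final Database database;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    TransactionalUpdateSketch(Database database) {
        this.database = database;
    }

    void update(String key, String value) {
        Transaction tx = database.begin();       // a. start a transaction
        String previous = cache.get(key);
        try {
            tx.update(key, value);               // b. update the database
            cache.put(key, value);               // c. update the cache
            tx.commit();                         // d. commit the transaction
        } catch (RuntimeException e) {
            tx.rollback();
            // undo the cache change so it does not outlive the failed transaction
            if (previous == null) {
                cache.remove(key);
            } else {
                cache.put(key, previous);
            }
            throw e;
        }
    }

    String get(String key) {
        return cache.get(key);
    }
}
```

<div class="MsoNormal">
With a plain map, readers can briefly see the new value between step c and a failed commit; a cache that takes part in the transaction itself would avoid that.</div>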
</div>Sourabh Ghosehttp://www.blogger.com/profile/12406723074994150793noreply@blogger.com5