{"id":581,"date":"2021-09-04T17:04:38","date_gmt":"2021-09-05T00:04:38","guid":{"rendered":"https:\/\/aplcs.com\/?p=581"},"modified":"2021-11-07T20:32:01","modified_gmt":"2021-11-07T20:32:01","slug":"what-is-a-delimited-text-file","status":"publish","type":"post","link":"https:\/\/aplcs.com\/?p=581","title":{"rendered":"What is a delimited text file?"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-post\" data-elementor-id=\"581\" class=\"elementor elementor-581\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-bdee3e6 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"bdee3e6\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6b8841e\" data-id=\"6b8841e\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c7f5198 elementor-widget elementor-widget-text-editor\" data-id=\"c7f5198\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>When I first started out in litigation support, I spent a lot of time fixing and updating data load files.\u00a0 Back in the early 2000s, every photocopy place was claiming to be an expert in document scanning and eDiscovery, so we ended up with a lot of incorrect or messy load files.\u00a0 It was an &#8220;interesting&#8221; time.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-546298e 
elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"546298e\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-fb1ee99\" data-id=\"fb1ee99\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c05b07c elementor-widget elementor-widget-text-editor\" data-id=\"c05b07c\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Of course, despite the mess and the weird formatting, the data load files are really just niche-specific delimited text files.\u00a0 Then what is a delimited text file?\u00a0\u00a0<\/p><p>A delimited text file is really just a plain text file with markers (delimiters) to help identify how the data separates into proper records and fields.<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-611bc57 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"611bc57\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-c706a89\" data-id=\"c706a89\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-fee9e07 elementor-widget 
elementor-widget-image\" data-id=\"fee9e07\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/aplcs.com\/wp-content\/uploads\/2021\/09\/tablescreenshot.png\" title=\"\" alt=\"\" loading=\"lazy\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-90d42ae elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"90d42ae\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-7c2b54b\" data-id=\"7c2b54b\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-c2cb719 elementor-widget elementor-widget-text-editor\" data-id=\"c2cb719\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Within any proper database or spreadsheet system (like the Excel table in the picture above), we can very easily distinguish the various pieces of information shown.\u00a0 We know where each column starts and stops, we can identify each record or row, and each field is easily isolated.\u00a0 The data is organized and properly accessible.<\/p><p>But what if I want to take the above data out of Excel and put it into another system, like Concordance or some other document review tool? 
Each software platform has its own setup and quirks.\u00a0 The display options, the way it functions, and the way the data is indexed all differ from platform to platform.\u00a0 We can&#8217;t just shove Excel into Concordance or vice versa because the two simply won&#8217;t understand each other.<\/p><p><em>But plain text is universal.\u00a0\u00a0<\/em>We can convert data into plain text and load it into another platform as plain text, because almost everything recognizes text.<\/p><p>The trouble is, if we take the text out of Excel (or any other software), we lose the organization that the original software provided and end up with something like the following:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-a76f418 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"a76f418\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-4c52d66\" data-id=\"4c52d66\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-9ea8c72 elementor-widget elementor-widget-image\" data-id=\"9ea8c72\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/aplcs.com\/wp-content\/uploads\/2021\/09\/plaintextjumble.png\" title=\"\" alt=\"\" loading=\"lazy\" 
\/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-67daca9 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"67daca9\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-ea07fa9\" data-id=\"ea07fa9\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-28a6f4e elementor-widget elementor-widget-text-editor\" data-id=\"28a6f4e\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>We, along with whatever new system the data will be imported into, lose the ability to distinguish the individual records or columns.\u00a0 The information is now a jumbled mess, and unorganized data is essentially useless. 
So we need markers that delineate records and fields using just plain text.\u00a0 Those markers are delimiters.<\/p><p>The simplest and most common delimited text file is the CSV (Comma-Separated Values) file, which uses the comma ( , ) as a delimiter and quotes ( &#8221; ) as text qualifiers, like so:<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-53135f5 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"53135f5\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-de72fd0\" data-id=\"de72fd0\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-e01daa8 elementor-widget elementor-widget-image\" data-id=\"e01daa8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"image.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/aplcs.com\/wp-content\/uploads\/2021\/09\/csvSamplefile.png\" title=\"\" alt=\"\" loading=\"lazy\" \/>\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-d5bf2f9 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"d5bf2f9\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div 
class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-6d63c45\" data-id=\"6d63c45\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-4e01ad8 elementor-widget elementor-widget-text-editor\" data-id=\"4e01ad8\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<p>Here we can clearly see the purpose of the comma and quotes.&nbsp; The comma separates each field value, and the quotes tell us that, despite containing multiple words and spaces, the entire phrase belongs in a single field.&nbsp; Thus:<\/p>\n<pre><span style=\"font-size: 14px;\">\"BIGBANK_00001677\",\"BIGBANK_00001680\",\"111.msg\",\"Kevin Lockhart &lt;BigBank&gt;\",\"Doug Stevens &lt;DStevens@bigbank.com\"<\/span><\/pre>\n<p><span style=\"font-size: 14px;\">The use of quotes ( &#8221; ) means that whatever is in between the quotes belongs together in the same field, and the <strong>comma<\/strong>&nbsp;means that whatever comes next belongs to a new field. 
This allows the information to be properly organized and imported into a new system.<\/span><\/p>\n<p><span style=\"font-size: 14px;\">The issue, of course, is that ( , ) and ( &#8221; ) are very prevalent within general documents in eDiscovery.&nbsp; If we were to use commas and quotes as the delimiters, most database systems would be confused by all the extra instances found in letters, reports, emails, etc.&nbsp; We would need delimiters that do not occur naturally or often &#8220;in the wild.&#8221;<\/span><\/p>\n<p><span style=\"font-size: 14px;\">In the late 80s, when Concordance was basically the only document review tool around, the standard delimiters introduced to the legal technology space were:&nbsp;&nbsp;<\/span><\/p>\n<pre>\u00b6&nbsp;ASCII Code (020) as the comma separator<\/pre>\n<pre>\u00fe&nbsp;ASCII Code (254) as the quote text qualifier<\/pre>\n<pre>\u00ae ASCII Code (174) as the new line indicator<\/pre>\n<p>(The code numbers are how you can type those characters on Windows: hold down the ALT key and type the numbers on your number keypad.&nbsp; Depending on the system, \u00fe and \u00ae may need a leading zero, as in ALT+0254 and ALT+0174.)<\/p>\n<p>As the above 3 characters were highly unlikely to appear naturally within business documents, they served as much better delimiters and qualifiers than characters like commas or quotes.&nbsp;&nbsp;<\/p>\n<p>And even as eDiscovery has evolved and more sophisticated document review database systems have appeared on the market, most systems still recognize the Concordance delimiters.&nbsp;&nbsp;<\/p>\n<p><\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>When I first started out in litigation support, I spent a lot of time fixing and updating data load files.&nbsp; Back in the early 2000s, every photocopy place was claiming to be an expert in document scanning and eDiscovery, so we ended up with a lot of incorrect or messy load files.&nbsp; It 
was an [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":615,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[29,23],"tags":[],"class_list":["post-581","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-databases","category-training"],"_links":{"self":[{"href":"https:\/\/aplcs.com\/index.php?rest_route=\/wp\/v2\/posts\/581"}],"collection":[{"href":"https:\/\/aplcs.com\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/aplcs.com\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/aplcs.com\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/aplcs.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=581"}],"version-history":[{"count":1,"href":"https:\/\/aplcs.com\/index.php?rest_route=\/wp\/v2\/posts\/581\/revisions"}],"predecessor-version":[{"id":616,"href":"https:\/\/aplcs.com\/index.php?rest_route=\/wp\/v2\/posts\/581\/revisions\/616"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/aplcs.com\/index.php?rest_route=\/wp\/v2\/media\/615"}],"wp:attachment":[{"href":"https:\/\/aplcs.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=581"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/aplcs.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=581"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/aplcs.com\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=581"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}