|
Tiny CSV Reader I. The Requirements 201506
CSV files are handy for bulk processing and are also human readable. Wikipedia has an article about the formalization of the format*, which we quote here:
* https://en.wikipedia.org/wiki/Comma-separated_values, last retrieved June 2015.
|
|
|
Tiny CSV Reader II. The Natural Language Expression 201506
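It is important to identify and count the occurrences of the double quote literal.
From here on we shorten it to "quote".
When quotes appear consecutively an odd number of times, one quote in that run is a field border (the first when the run opens a field, the last when it closes one), and the remaining quotes pair up as escaped literal quotes.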
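The counting rule can be sketched in a few lines of Java. This is only an illustration of the idea, not the code from this series; the class name QuoteRuns and the method oddRunCount are hypothetical names chosen for the example.

```java
final class QuoteRuns {
    /**
     * Counts the runs of consecutive quotes whose length is odd.
     * Each such run contributes exactly one field border; runs of even
     * length consist only of escaped pairs (or of an empty quoted field).
     */
    static int oddRunCount(String text) {
        int oddRuns = 0;
        int i = 0;
        while (i < text.length()) {
            if (text.charAt(i) == '"') {
                int start = i;
                // Walk to the end of this run of quotes.
                while (i < text.length() && text.charAt(i) == '"') {
                    i++;
                }
                if ((i - start) % 2 == 1) {
                    oddRuns++;
                }
            } else {
                i++;
            }
        }
        return oddRuns;
    }
}
```

For the text "a""b" (with its surrounding quotes) the runs have lengths 1, 2 and 1, so the method returns 2: an opening border and a closing border, with one escaped quote in between.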
A record separator can be a newline literal. The field borders before it must appear an even number of times; otherwise the newline is a literal inside a field. In other words, a record separator is a newline preceded by an even number of odd-length quote runs. It follows that the next newline that is a valid record separator also has an even number of quotes before it.
A field separator can be a comma literal. As with the record separator, an even number of odd-length quote runs always precedes it.
A field can be an empty string. A record can be an empty line. A whole file can be empty. If we do not handle these cases, the program may halt before its expected end.
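The rules above can be sketched as a single pass that tracks quote parity. This is a minimal sketch under stated assumptions, not the implementation from this series: the class name TinyCsvSketch is hypothetical, an empty line is reported as a record with one empty field, an empty file yields no records, and a final line terminator does not start an extra record.

```java
import java.util.ArrayList;
import java.util.List;

final class TinyCsvSketch {
    static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean insideQuotes = false; // true while an odd number of border quotes has been seen
        boolean pending = false;      // true once the current record has consumed any character

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '"') {
                if (insideQuotes && i + 1 < text.length() && text.charAt(i + 1) == '"') {
                    field.append('"'); // a doubled quote inside a quoted field is a literal quote
                    i++;
                } else {
                    insideQuotes = !insideQuotes; // a border quote flips the parity
                }
                pending = true;
            } else if (c == ',' && !insideQuotes) {
                record.add(field.toString()); // comma with even parity: field separator
                field.setLength(0);
                pending = true;
            } else if ((c == '\n' || c == '\r') && !insideQuotes) {
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++; // treat CR LF as one line terminator
                }
                record.add(field.toString()); // newline with even parity: record separator
                field.setLength(0);
                records.add(record);
                record = new ArrayList<>();
                pending = false;
            } else {
                field.append(c); // ordinary character, or a comma/newline inside quotes
                pending = true;
            }
        }
        if (pending) { // last record without a final line terminator
            record.add(field.toString());
            records.add(record);
        }
        return records;
    }
}
```

With these assumptions, parse("") returns an empty list, parse("a,,c\n") returns one record whose middle field is the empty string, and parse("a\n\nb") returns three records, the second holding a single empty field.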
|
|
|
Tiny CSV Reader III. Translating the Natural Language Expression into a Programmable Regular Expression 201506
After a long hike along a winding trail, we implement it here in Java for convenience.
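One way such a regular expression could look is sketched below. It is not necessarily the expression used in this series: the class name RegexCsvSketch and its methods are hypothetical. Java look-behind cannot count an unbounded number of quotes, so the sketch counts the quotes after a separator instead of before it; in a well-formed file the total number of quotes is even, so the two counts have the same parity. The same conventions as in the natural language rules are assumed for empty fields, empty lines and an empty file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class RegexCsvSketch {
    // Look-ahead: zero or more pairs of quotes, then no further quote up to
    // the end of input, i.e. an even number of quotes follows this position.
    private static final String EVEN_QUOTES_AHEAD = "(?=(?:[^\"]*\"[^\"]*\")*[^\"]*\\z)";
    private static final Pattern RECORD_SEPARATOR = Pattern.compile("\\r?\\n" + EVEN_QUOTES_AHEAD);
    private static final Pattern FIELD_SEPARATOR = Pattern.compile("," + EVEN_QUOTES_AHEAD);

    static List<List<String>> parse(String text) {
        List<List<String>> records = new ArrayList<>();
        if (text.isEmpty()) {
            return records; // an empty file has no records
        }
        // Limit -1 keeps trailing empty strings, so empty lines survive the split.
        String[] lines = RECORD_SEPARATOR.split(text, -1);
        int count = lines.length;
        if (lines[count - 1].isEmpty()) {
            count--; // a final line terminator does not start another record
        }
        for (int k = 0; k < count; k++) {
            List<String> fields = new ArrayList<>();
            for (String raw : FIELD_SEPARATOR.split(lines[k], -1)) {
                fields.add(unquote(raw));
            }
            records.add(fields);
        }
        return records;
    }

    /** Strips the border quotes and collapses each escaped pair into one quote. */
    static String unquote(String raw) {
        if (raw.length() >= 2 && raw.startsWith("\"") && raw.endsWith("\"")) {
            return raw.substring(1, raw.length() - 1).replace("\"\"", "\"");
        }
        return raw;
    }
}
```

The look-ahead rescans the rest of the input at every candidate separator, so this sketch is quadratic in the worst case and is better suited to small files than to bulk processing.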
|
|
|
Tiny CSV Reader IV. The Proof of Concept Test File 201506
We post the test data file here. It covers a limited set of scenarios that, in real life, could make a lesser program fail.
|
|