Perl reading a csv file into an array
Pick the one that matches your needs. In each row the fields are separated by commas. Of course the separator can be any character, as long as it is the same throughout the file. Anyway, the task was to sum the numbers in the 3rd column.
The algorithm
The process should go like this: Read in the file line by line. For each line, extract the 3rd column. Add the value to a central variable where we accumulate the sum. We have already learned how to read a file line by line, so we only need to know how to process each row and how to extract the 3rd column. I cannot easily use substr, because the position of the 3rd field changes from row to row. What is fixed is that it is between the 2nd and the 3rd comma.
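To make "between the 2nd and the 3rd comma" concrete, here is a minimal sketch that locates the commas with index and cuts out the field with substr. The subroutine name third_field_by_index is made up for this illustration, and the sketch assumes that no field contains a comma of its own.

```perl
use strict;
use warnings;

# Return the text between the 2nd and the 3rd comma of a line,
# or undef if the line has fewer than two commas.
# (third_field_by_index is a hypothetical name used only in this sketch.)
sub third_field_by_index {
    my ($line) = @_;

    my $first = index($line, ',');
    return undef if $first < 0;

    my $second = index($line, ',', $first + 1);
    return undef if $second < 0;

    my $third = index($line, ',', $second + 1);
    $third = length $line if $third < 0;    # the 3rd field is the last one

    return substr($line, $second + 1, $third - $second - 1);
}

print third_field_by_index('Tudor,Vidor,10,Hapci'), "\n";    # prints 10
```

It works, but it takes three index calls and some careful arithmetic to fetch a single field.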
I could use index 3 times on each row to locate the 2nd and the 3rd comma, and then use substr, but Perl has a much easier way for this.
Using split
split usually gets two parameters. The first is a knife, the second is the string that needs to be cut into pieces. The knife is actually a regular expression, but for now we can stick to simple strings there. For a row such as Tudor,Vidor,10,Hapci the array fields will be filled with 4 values: "Tudor", "Vidor", "10" and "Hapci".
Comma in the field
Every time you get a CSV file you can use this script to add up the values in the 3rd column.
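A script of this kind might look like the sketch below. It is only an illustration under a few assumptions: the subroutine name sum_third_column is made up, the data is read from an in-memory string so that the example is self-contained, and only the first sample row comes from the text above (the second row is invented).

```perl
use strict;
use warnings;

# Sum the 3rd column of comma-separated data read from a filehandle.
# (sum_third_column is a hypothetical name used only in this sketch.)
sub sum_third_column {
    my ($fh) = @_;
    my $sum = 0;
    while (my $line = <$fh>) {
        chomp $line;
        my @fields = split /,/, $line;    # cut the line at every comma
        $sum += $fields[2];               # the 3rd column has index 2
    }
    return $sum;
}

# Sample data: the first row is from the text, the second is made up.
my $data = "Tudor,Vidor,10,Hapci\nSzundi,Morgo,7,Szende\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
print sum_third_column($fh), "\n";    # prints 17
close $fh;
```

For a real file, open the filehandle on the file name instead of on a reference to a string.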
Unfortunately, at some point you get warnings while running your script. The cause is a field that contains a comma of its own and is therefore enclosed in quotes. This is totally normal within the "standard" of CSV, but our script cannot properly handle the situation. It just cuts where it finds the separator character. We need a more robust solution to read CSV files: a module with an object oriented interface. Even if you don't know what OOP is, you don't have to worry. We won't really learn OOP at this point, we'll just use the module.
We learn a little more syntax and a few expressions, just so that people who are familiar with object oriented programming can connect it to what they already know. The module provides a set of new functionality, namely reading, parsing and writing CSV files. Perl programmers call these 3rd-party extensions modules, though people coming from other languages might be more familiar with words such as library or extension.
The following steps split the lines of a CSV file into parts using a delimiter:
Step 1: Read in the file line by line.
Step 2: For each line, store all the values in an array.
Following is the code that uses the split function to separate the strings stored in the new.
Here, we are going to save it as test. Run the saved file with the following command: perl test. If a field contains a comma inside quotes and the split function is used, split still separates the values at every comma it finds, because it does not care about the quotes, nor does it understand anything about CSV.
It just cuts where it finds the separator character. Consider a CSV file in which the first field contains a comma and is therefore enclosed in quotes. Applying the split function to such a file divides the quoted field into parts even though it is within quotes. Also, since our code prints only three fields, the third field of the last row is dropped from the output.
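The effect is easy to reproduce. The sketch below uses a made-up sample line, since the original sample file is not shown here, and applies split to a row whose first field has a comma inside quotes.

```perl
use strict;
use warnings;

# A hypothetical line whose first field contains a comma inside quotes.
my $line = '"Smith, John",42,London';

my @fields = split /,/, $line;

print scalar(@fields), "\n";    # prints 4, although the row has 3 fields

# Printing only the first three pieces loses the real 3rd field (London).
print "$_\n" for @fields[0 .. 2];
```

The quoted name is cut in two, every later value shifts one position to the right, and the last value falls off the end of the three printed fields.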
To handle such situations, a CSV-parsing module is used instead of split. Such a module understands quoting and does not divide a field at a comma that appears within quotes. A module is included in a Perl program with a use statement.
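The text does not name the module, so the sketch below assumes it is Text::CSV from CPAN, a commonly used CSV parser with an object oriented interface. The helper name parse_csv_line and the sample line are made up for this illustration; new, parse, fields and error_diag are methods of Text::CSV.

```perl
use strict;
use warnings;
use Text::CSV;    # assumption: the module meant by the text; not a core module

# binary => 1 allows arbitrary characters in fields,
# auto_diag => 1 makes the parser report parse errors.
my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 })
    or die "Cannot create CSV parser: " . Text::CSV->error_diag;

# Return the fields of one CSV line, or an empty list if parsing fails.
# (parse_csv_line is a hypothetical helper used only in this sketch.)
sub parse_csv_line {
    my ($line) = @_;
    return $csv->parse($line) ? $csv->fields : ();
}

my @fields = parse_csv_line('"Smith, John",42,London');
print scalar(@fields), "\n";    # prints 3
print "$fields[0]\n";           # prints Smith, John
```

The quoted field stays in one piece and the quotes themselves are removed from the value.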
But first, the module has to be downloaded and installed on your device before its functionality can be used. The constructor is called on the class to create a parser object. The parse call then tries to parse the current line and split it up into pieces. It returns true or false depending on success or failure.
Fields with embedded new-lines
In a CSV file, there can also be fields that span several lines, that is, fields with a new-line embedded between the words.
Multi-line fields of this kind behave very differently when passed through the split function than rows with no embedded new-line, because reading the file line by line breaks such a record into pieces before split even sees it.

Each record is joined with commas using the join function and printed. A line break is added at the end. I'm using a while statement to create the Perl data structure. At the end of the first while loop, it will be [masao, 10, Japan], but first create only the frame.
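A minimal sketch of that idea follows: a while loop builds an array of array references, and join writes each record back out. Only the record masao, 10, Japan comes from the text; the second record and the variable names are made up, and the data is read from an in-memory string to keep the example self-contained.

```perl
use strict;
use warnings;

# Sample data: the first record is from the text, the second is made up.
my $data = "masao,10,Japan\nkenta,20,USA\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @records;    # first create only the frame
while (my $line = <$fh>) {
    chomp $line;
    my @fields = split /,/, $line;
    push @records, \@fields;    # each record is a reference to its own array
}
close $fh;

# Join each record with commas and add a line break at the end.
for my $record (@records) {
    print join(',', @$record), "\n";
}
```

Because @fields is declared inside the loop, every iteration gets a fresh array, so the stored references do not overwrite each other.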