Reading CSV files in Python


Code examples for reading CSV files in Python. The examples use the csv module from the Python standard library.

usmetro.csv

rank,area,state,population
1,"New York","NY",20320876
2,"Los Angeles","CA",13353907
3,"Chicago","IL",9533040
4,"Dallas-Fort Worth","TX",7399662
5,"Houston","TX",6892427
6,"Washington D.C.","DC",6216589
7,"Miami","FL",6158824
8,"Philadelphia","PA",6096120
9,"Atlanta","GA",5884736
10,"Boston","MA",4836531
11,"Phoenix","AZ",4737270
12,"San Francisco","CA",4727357

Read the CSV file above with csv.reader, which returns each row as a list of strings.

csv_reader_example.py

import csv

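# newline='' is what the csv module documentation recommends when opening a file for a reader
with open('usmetro.csv', newline='') as csvfile: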
    data = csv.reader(csvfile)

    # iterate each row
    for row in data:
        # each row is returned as a list
        print(type(row))
        # print the entire row
        print(row)
        # print the second column of each row, area
        print(row[1])

output

<class 'list'>
['rank', 'area', 'state', 'population']
area
<class 'list'>
['1', 'New York', 'NY', '20320876']
New York
<class 'list'>
['2', 'Los Angeles', 'CA', '13353907']
Los Angeles
...
<class 'list'>
['12', 'San Francisco', 'CA', '4727357']
San Francisco
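
The loop above treats the header like any other row and leaves every value as a string. A minimal sketch of one way to handle both, assuming the same column order as usmetro.csv: consume the header with next() and convert the numeric columns with int(). The names SAMPLE and read_rows are made up for this illustration, and the data is held in memory so the snippet runs without the file.

```python
import csv
import io

# A few rows copied from usmetro.csv, kept in memory so the sketch is self-contained.
SAMPLE = '''rank,area,state,population
1,"New York","NY",20320876
2,"Los Angeles","CA",13353907
3,"Chicago","IL",9533040
'''


def read_rows(csvfile):
    """Return (header, rows) with rank and population converted to int."""
    reader = csv.reader(csvfile)
    # next() consumes the header row, so the loop below only sees data rows
    header = next(reader)
    rows = [
        [int(rank), area, state, int(population)]
        for rank, area, state, population in reader
    ]
    return header, rows


header, rows = read_rows(io.StringIO(SAMPLE))
print(header)
print(rows[0])
```

To run it against the real file, pass the object returned by open('usmetro.csv', newline='') instead of the io.StringIO wrapper.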

Reading a CSV file with csv.DictReader lets you reference each column by name. By default the names are taken from the file's header row, so this assumes the CSV file has one.

csv_dictreader_example.py

import csv

# making use of the header row
with open('usmetro.csv', newline='') as csvfile:
    data = csv.DictReader(csvfile)

    # iterate each row
    for row in data:
        # each row is now a dict keyed by column name
        # (an OrderedDict on Python 3.6 and 3.7, a plain dict on Python 3.8 and later)
        print(type(row))
        # print the entire row
        print(row)
        # now we can reference each column by name
        print(row['area'])

output (from Python 3.6 or 3.7; Python 3.8 and later print plain dicts instead of OrderedDict)

<class 'collections.OrderedDict'>
OrderedDict([('rank', '1'), ('area', 'New York'), ('state', 'NY'), ('population', '20320876')])
New York
<class 'collections.OrderedDict'>
OrderedDict([('rank', '2'), ('area', 'Los Angeles'), ('state', 'CA'), ('population', '13353907')])
Los Angeles
...
<class 'collections.OrderedDict'>
OrderedDict([('rank', '12'), ('area', 'San Francisco'), ('state', 'CA'), ('population', '4727357')])
San Francisco
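
Because the row type depends on the Python version, it can help to normalize it. The sketch below is one possible approach, not the only one: it inspects the column names through the reader's fieldnames attribute and copies each row into a plain dict. SAMPLE and records are names invented for this example, and the data is a short in-memory excerpt of usmetro.csv.

```python
import csv
import io

# A short excerpt of usmetro.csv, kept in memory so the sketch is self-contained.
SAMPLE = '''rank,area,state,population
1,"New York","NY",20320876
2,"Los Angeles","CA",13353907
'''

reader = csv.DictReader(io.StringIO(SAMPLE))

# fieldnames holds the column names read from the header row
print(reader.fieldnames)

# dict(row) yields a plain dict whether the reader produced a dict or an OrderedDict
records = [dict(row) for row in reader]
print(records[0]['area'])
print(records[1])
```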

csv_dictreader_example2.py

import csv

# use csv.DictReader, making use of the header row
with open('usmetro.csv', newline='') as csvfile:
    data = csv.DictReader(csvfile)

    # iterate each row
    for row in data:
        # an example sentence using an f-string that references all of the columns
        print(f"{row['area']}, {row['state']} is ranked #{row['rank']} and has a population of {int(row['population']):,}")

output

New York, NY is ranked #1 and has a population of 20,320,876
Los Angeles, CA is ranked #2 and has a population of 13,353,907
Chicago, IL is ranked #3 and has a population of 9,533,040
Dallas-Fort Worth, TX is ranked #4 and has a population of 7,399,662
Houston, TX is ranked #5 and has a population of 6,892,427
Washington D.C., DC is ranked #6 and has a population of 6,216,589
Miami, FL is ranked #7 and has a population of 6,158,824
Philadelphia, PA is ranked #8 and has a population of 6,096,120
Atlanta, GA is ranked #9 and has a population of 5,884,736
Boston, MA is ranked #10 and has a population of 4,836,531
Phoenix, AZ is ranked #11 and has a population of 4,737,270
San Francisco, CA is ranked #12 and has a population of 4,727,357
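
Referencing columns by name also makes simple aggregation easy to read. As a rough sketch, assuming the same column names as usmetro.csv, the snippet below totals the population per state with collections.Counter. SAMPLE and totals are illustrative names, and only the first five rows of the file are included to keep it short.

```python
import csv
import io
from collections import Counter

# The first five data rows of usmetro.csv, kept in memory so the sketch is self-contained.
SAMPLE = '''rank,area,state,population
1,"New York","NY",20320876
2,"Los Angeles","CA",13353907
3,"Chicago","IL",9533040
4,"Dallas-Fort Worth","TX",7399662
5,"Houston","TX",6892427
'''

totals = Counter()
for row in csv.DictReader(io.StringIO(SAMPLE)):
    # values arrive as strings, so convert before adding
    totals[row['state']] += int(row['population'])

# most_common() returns (state, total) pairs ordered from largest total to smallest
for state, total in totals.most_common():
    print(f"{state}: {total:,}")
```

With only these five rows, the two Texas areas are combined into a single TX total, which places TX ahead of CA.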