Blog Post #147: Reading from Files: .read(), .readline(), and .readlines()

In Post #146, we learned how to open a file in “read” mode ('r') using the with open() statement. Now that we have an open file object, how do we actually get the content out of it? Python’s file objects provide several methods for reading data.

In this post, we’ll explore the three primary methods for reading from a text file: .read(), .readline(), and .readlines(). We’ll compare their behavior and discuss the important memory implications of each.

Setting Up Our Example File

Before we can read a file, we need one to exist. Create a new file in the same directory as your Python script and name it poem.txt. Add the following content to it:

Roses are red,
Violets are blue,
Python is awesome,
And so are you.

We will use this file for all the examples below.

.read(): Reading the Entire File

The .read() method is the simplest way to get the content of a file. It reads everything from the current position to the end of the file and returns it as a single string, including all the newline characters (\n).

with open("poem.txt", "r") as file:
    content = file.read()

print("--- Content from .read() ---")
print(content)
print("--- End of Content ---")

The output will be the entire file as one string:

--- Content from .read() ---
Roses are red,
Violets are blue,
Python is awesome,
And so are you.
--- End of Content ---

Memory Warning: This method is simple, but it loads the entire file into memory at once. That is perfectly fine for small text files, but it is a bad idea for large ones (e.g., gigabyte-sized log files), where it can exhaust your computer’s RAM, a problem we first discussed in Post #125.
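One way to keep memory use bounded is the optional size argument of .read(), which caps how many characters a single call returns. The following is only a minimal sketch: it writes a throwaway copy of the poem to a temporary directory so it can run on its own, and the chunk size of 16 and the names POEM, sample_path, and chunks are arbitrary choices made for illustration.

```python
import os
import tempfile

POEM = "Roses are red,\nViolets are blue,\nPython is awesome,\nAnd so are you."

chunks = []
with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical throwaway copy of poem.txt so the sketch is self-contained.
    sample_path = os.path.join(tmp_dir, "poem.txt")
    with open(sample_path, "w") as file:
        file.write(POEM)

    with open(sample_path, "r") as file:
        while True:
            chunk = file.read(16)  # return at most 16 characters per call
            if chunk == "":        # an empty string signals end of file
                break
            chunks.append(chunk)

print(f"Read {len(chunks)} chunks")
print("".join(chunks) == POEM)
```

Each call continues from where the previous one stopped, so joining the chunks reproduces the original text. In real code you would process each chunk as it arrives instead of collecting them all in a list.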

.readline(): Reading One Line at a Time

The .readline() method is more fine-grained. It reads a single line from the file, up to and including the newline character (\n) at the end of that line. Each call reads the next line in sequence.

Once the end of the file has been reached, every further call returns an empty string (""). A blank line inside the file comes back as "\n", so an empty string always means there is nothing left to read.

with open("poem.txt", "r") as file:
    line1 = file.readline()
    line2 = file.readline()

# We use .strip() to remove the trailing newline for cleaner printing
print(f"Line 1: {line1.strip()}")
print(f"Line 2: {line2.strip()}")

The output shows the first two lines:

Line 1: Roses are red,
Line 2: Violets are blue,

Memory Advantage: This method is very memory-efficient because it only ever holds one line in memory at a time, making it suitable for reading large files.
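The end-of-file behavior of .readline() can be used to walk through a whole file with a while loop. This is a rough sketch under a few assumptions: it creates its own temporary copy of the poem so it runs standalone, and the names POEM, sample_path, and lines_seen are made up for the example.

```python
import os
import tempfile

POEM = "Roses are red,\nViolets are blue,\nPython is awesome,\nAnd so are you."

lines_seen = []
with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical throwaway copy of poem.txt so the sketch is self-contained.
    sample_path = os.path.join(tmp_dir, "poem.txt")
    with open(sample_path, "w") as file:
        file.write(POEM)

    with open(sample_path, "r") as file:
        while True:
            line = file.readline()
            if line == "":  # end of file; a blank line would be "\n"
                break
            # Drop the trailing newline before storing the text.
            lines_seen.append(line.rstrip("\n"))

for number, text in enumerate(lines_seen, start=1):
    print(f"Line {number}: {text}")
```

The list is kept here only so the result can be printed afterwards. To keep the memory advantage on a large file, handle each line inside the loop and let it go.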

.readlines(): Reading All Lines into a List

The .readlines() method reads all the remaining lines in a file and returns them as a list of strings. Each string in the list represents one line from the file and includes its trailing newline character; the last line has one only if the file itself ends with a newline.

with open("poem.txt", "r") as file:
    lines_list = file.readlines()

print(lines_list)

The output is a list where each element is a line from the file:

['Roses are red,\n', 'Violets are blue,\n', 'Python is awesome,\n', 'And so are you.']

Memory Warning: Like .read(), this method loads the entire file into memory at once. It can consume even more memory than .read() because of the overhead of creating a list and many small string objects. You should also avoid this method for very large files.
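All three methods share the same file position, which is why .readlines() returns only the remaining lines. The sketch below tries to make that visible by mixing the methods on one open file. It assumes a temporary copy of the poem, and the names POEM, sample_path, first_line, remaining, and leftover are illustrative only.

```python
import os
import tempfile

POEM = "Roses are red,\nViolets are blue,\nPython is awesome,\nAnd so are you."

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical throwaway copy of poem.txt so the sketch is self-contained.
    sample_path = os.path.join(tmp_dir, "poem.txt")
    with open(sample_path, "w") as file:
        file.write(POEM)

    with open(sample_path, "r") as file:
        first_line = file.readline()  # consumes line 1
        remaining = file.readlines()  # lines 2 through 4 only
        leftover = file.read()        # nothing is left to read

print(repr(first_line))  # 'Roses are red,\n'
print(len(remaining))    # 3
print(repr(leftover))    # ''
```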

What’s Next?

We now have three methods for reading files:

  • .read(): Reads the whole file into one string. Good for small files.
  • .readline(): Reads one line at a time. Memory-efficient, but can be clumsy to use in a loop.
  • .readlines(): Reads the whole file into a list of strings. Good for small files.
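To see how the two whole-file methods relate, the following sketch reads the same file twice and compares the results. It is an illustration under assumptions, not a recommended pattern: the temporary file and the names POEM, sample_path, whole_text, and line_list are invented for the example, and str.splitlines() is brought in only as one convenient way to get lines without their newline characters.

```python
import os
import tempfile

POEM = "Roses are red,\nViolets are blue,\nPython is awesome,\nAnd so are you."

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical throwaway copy of poem.txt so the sketch is self-contained.
    sample_path = os.path.join(tmp_dir, "poem.txt")
    with open(sample_path, "w") as file:
        file.write(POEM)

    with open(sample_path, "r") as file:
        whole_text = file.read()      # one string

    with open(sample_path, "r") as file:
        line_list = file.readlines()  # list of strings, newlines kept

# Joining the list from .readlines() gives back the string from .read().
print("".join(line_list) == whole_text)

# splitlines() on the full text yields the lines without newline characters.
print(whole_text.splitlines())
```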

We’ve seen that .read() and .readlines() can be dangerous for large files, and calling .readline() repeatedly can be awkward. So, what is the best, most Pythonic way to process a file line-by-line? In Post #148, we will learn how to iterate directly over a file object in a for loop, which is the most efficient and readable pattern for the job.

Author

Debjeet Bhowmik

Experienced Cloud & DevOps Engineer with hands-on experience in AWS, GCP, Terraform, Ansible, ELK, Docker, Git, GitLab, Python, PowerShell, and Shell, and theoretical knowledge of Azure, Kubernetes, and Jenkins. In my free time, I write blogs on ckdbtech.com.
