Python Virtual Environment

In this post, I want to talk about virtual environments in Python and why we need them. This is especially important if you are a researcher working on multiple projects and you are using Python to implement your methods. To follow the instructions, make sure you are using Python 3.x. The method is applicable to all operating systems, with minimal changes to the way you run the Python command.
If you are a Windows user, I would highly recommend getting familiar with Unix commands, especially if you are a developer! To do so, you can set up and use the Windows Subsystem for Linux (WSL), available on Windows 10 and later.
What is a Virtual Environment?
When you install Python and its package manager, pip, on your system, both become accessible system-wide. This means you can use python3 or pip3 from any location. Consequently, packages installed with pip3 are also installed globally, making them available to all projects. This approach has several drawbacks:
- Unnecessary dependencies: Not all projects require every installed package.
- Version conflicts: Different projects might need different versions of the same package.
- Poor code sharing: Sharing your code and its environment with others becomes very challenging.
To address these issues, it is better to create and use a separate virtual environment for each project. Python offers a venv module for this purpose. Each virtual environment maintains its own set of installed packages while still leveraging the underlying Python installation. For instance, to create a project directory named sample_ml_project and its corresponding virtual environment, you would execute the following commands in your terminal (note: $ signifies the beginning of a command line, thus it is not part of the command):
$ mkdir sample_ml_project
$ cd sample_ml_project
$ python3 -m venv myenv
The snippet above first creates a project directory named sample_ml_project and then navigates into it (using the cd command). Finally, it creates a virtual environment within the project directory, naming it myenv (you can choose any name you prefer). This results in a myenv directory within your project.
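As an aside, the same step can be done from Python itself, because venv is an ordinary standard-library module. The sketch below is only an illustration: the helper name create_env is hypothetical, the project is created in a temporary directory so nothing is left behind, and with_pip=False is used to keep it fast (the command-line form installs pip by default).

```python
import os
import tempfile
import venv
from pathlib import Path


def create_env(project_dir: Path, env_name: str = "myenv") -> Path:
    """Create a virtual environment inside project_dir and return its path."""
    env_dir = project_dir / env_name
    # with_pip=False skips bootstrapping pip (an assumption made for speed).
    # symlinks mirrors the command-line default: symlinks everywhere but Windows.
    venv.create(env_dir, with_pip=False, symlinks=(os.name != "nt"))
    return env_dir


if __name__ == "__main__":
    # Work in a throwaway directory so the demo does not touch real projects.
    with tempfile.TemporaryDirectory() as tmp:
        project = Path(tmp) / "sample_ml_project"
        project.mkdir()
        env = create_env(project)
        # List what venv generated (a pyvenv.cfg file plus a few directories).
        print(sorted(p.name for p in env.iterdir()))
```

The pyvenv.cfg file in the generated directory is what marks it as a virtual environment and records which Python installation it is based on.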
To begin using this newly created virtual environment, you must activate it. This can be achieved by executing the following command:
$ source myenv/bin/activate
Once activated, the environment functions similarly to your standard Python and pip setup, but all package installations are isolated within the environment. To deactivate the environment, simply execute:
$ deactivate
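Although the commands above are for Unix-based systems, the Windows equivalents are similar. The main difference is activation: the scripts live in myenv\Scripts instead of myenv/bin, so you run myenv\Scripts\activate.bat in Command Prompt or myenv\Scripts\Activate.ps1 in PowerShell. For more details, see https://docs.python.org/3/library/venv.html.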
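If you are ever unsure whether an environment is active, you can ask the interpreter directly. The following is a minimal sketch (the function name in_virtual_environment is my own): inside an environment created with venv, sys.prefix points at the environment directory, while sys.base_prefix still points at the underlying Python installation.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv."""
    # Outside a virtual environment the two prefixes are identical.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtual_environment())
print("packages are installed under:", sys.prefix)
```

Running this once before and once after the activate command should show the prefix switching from the system installation to the myenv directory.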
The Requirements File
To ensure others can effectively replicate your project's environment, it's crucial to document the installed packages along with their versions. This information can be conveniently managed within a file named requirements.txt placed in your project's root directory.
You can manually populate this file or utilize the pip freeze command (while the virtual environment is active) for automated generation in the following way:
$ pip3 freeze > requirements.txt
If you need more information, you can visit the following link https://pip.pypa.io/en/stable/cli/pip_freeze/#examples.
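For a typical environment, pip freeze writes one name==version line per installed package. To make that format concrete, here is a small sketch that reads such lines into a dictionary. The function name parse_pinned and the sample pins are purely illustrative, and the parser deliberately handles only simple pinned lines, not every form that pip accepts in a requirements file.

```python
def parse_pinned(text: str) -> dict[str, str]:
    """Map package name to version for every 'name==version' line in text."""
    pins = {}
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and anything that is not an exact pin.
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


# Illustrative content only; your own requirements.txt will differ.
sample = """\
# example pins
numpy==1.26.4
pandas==2.2.2
"""
print(parse_pinned(sample))  # {'numpy': '1.26.4', 'pandas': '2.2.2'}
```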
Once the requirements.txt file has been generated, anyone can install the listed packages using:
$ pip3 install -r requirements.txt
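After installing, it can be useful to confirm that the active environment really matches the pinned versions. The sketch below is one possible way to do that with the standard-library importlib.metadata module. The function name check_pins and the example pin are hypothetical, and versions are compared as plain strings, which is sufficient only for exact name==version pins.

```python
from importlib import metadata


def check_pins(pins: dict[str, str]) -> dict[str, str]:
    """Return {package: problem} for every pin the current environment misses."""
    problems = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Exact string comparison: assumes pins look like name==version.
        if installed != wanted:
            problems[name] = f"installed {installed}, expected {wanted}"
    return problems


# Hypothetical pin; replace it with the entries from your requirements.txt.
print(check_pins({"hypothetical-package-that-does-not-exist": "1.0"}))
```

An empty result means every listed package is present at the expected version.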