Postgres and pgAdmin on Docker: A Tutorial on Windows

Sonu Ranjit Jacob
8 min read · Feb 1, 2023

This tutorial will show you how to:

  1. Create a Postgres database in a Docker container
  2. Download data from a website and save it into the Postgres database hosted on a Docker container using Python
  3. View and access this data with pgAdmin, running in another Docker container that connects to the container from step 1.

Prerequisites: Docker Desktop, Git Bash, Python, wget

I recommend getting the latest version of wget to prevent any SSL connection issues — you can uninstall your current version and install it according to the instructions here. I also recommend typing out any Git Bash commands yourself! Skip to step 6 if you’re only looking for the code that combines the Docker containers; read through the article if you want to understand how it works :)

Step 0: Create the following folder structure (or just fork my repository)

The files in the folder will be explained in the next steps, you can create empty files if you are not forking the repository.

----Docker-Postgres-pgAdmin
--------ingest_data.py
--------Dockerfile
--------docker-compose.yaml
--------name_gender_data
  • ingest_data.py is a Python script we will use to ingest data into the Docker container
  • name_gender_data is an empty folder that will hold the Postgres data files mounted from the container.

Step 1: Creating a Docker container that contains Postgres

We download a Docker image that contains Postgres by executing the following command in Git Bash.

docker run -it \
-e POSTGRES_USER="root" \
-e POSTGRES_PASSWORD="root" \
-e POSTGRES_DB="name_gender" \
-v "C:\Sonu_Local\Docker-Postgres-pgAdmin\name_gender_data":/var/lib/postgresql/data \
-p 5432:5432 \
--network=pg-network \
--name pg-database \
postgres:13

Explanation of the arguments:

  • -it runs the container interactively with a terminal attached, i.e. you can shut it down using Ctrl+C
  • POSTGRES_USER is the name of the user. I have named it root
  • POSTGRES_PASSWORD is the password, also set to root for convenience. Use a different username and password for critical projects.
  • POSTGRES_DB indicates the name of the database.
  • -v mounts a local folder into the container. This is done to ensure our data persists and is not discarded or lost when we stop the Docker container. C:\Sonu_Local\Docker-Postgres-pgAdmin\name_gender_data is the path on our local system and /var/lib/postgresql/data is the path where the data is stored in the Docker container.
  • -p maps port 5432 on our machine to port 5432 in the container, the port Postgres listens on.
  • --network=pg-network is the network used for our Docker containers to communicate with each other on the same Docker host. Create it first with docker network create pg-network if it does not already exist.
  • --name=pg-database is the name used to refer to the Docker container that we just created, i.e. the Docker container containing Postgres
  • postgres:13 is the name of the Docker image we will download.

Once we run the command in a Git Bash terminal, the logs will end with a message showing that the database is ready to accept connections.
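These flags map one-to-one onto the connection URL we will hand to SQLAlchemy in step 3. A small helper (hypothetical, just to make the mapping explicit) would look like:

```python
def pg_url(user, password, host, port, db):
    """Build the connection URL for our Postgres container.

    Because of `-p 5432:5432`, the host is `localhost` when connecting
    from our own machine, or `pg-database` from another container on
    pg-network.
    """
    return f"postgresql://{user}:{password}@{host}:{port}/{db}"


print(pg_url("root", "root", "localhost", 5432, "name_gender"))
# postgresql://root:root@localhost:5432/name_gender
```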

Step 2: Picking a dataset

Now, we will pick a dataset available on a website. I have chosen the ‘Gender by Name’ dataset from the UCI Machine Learning Repository as I found it quite interesting (and also because I wish my parents had checked this before naming me haha). To get the link, click on the Data Folder, right-click on name_gender_dataset.csv and click on Copy link address. Save this link address for the next step.

Step 3: Downloading the dataset/Ingesting the data into the Postgres Docker container using Python

We need to use the settings and credentials of the Postgres database and account we set up in step 1 to save the dataset into the database:

  • user and password — the credentials of the Postgres account created in step 1, both root
  • host — localhost, since the container’s port is published on our machine
  • port — 5432
  • db — name_gender
  • table_name — we can give the table any name we want. I will set it to name_gender_table (not really going for originality here haha)
  • url — the URL that we got in step 2.

We pass these values as arguments to the main function using the argparse library. Next, we download the dataset with the wget command, since the Docker image we will use runs a Linux OS. We create a SQLAlchemy engine to interact directly with the database. We insert the column names first, via df.head(n=0), to create the table, and then insert the rest of the data. We log each step using the logging library.
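The script itself lives in the linked repository; a minimal sketch of ingest_data.py consistent with the description above could look like the following (the helper name build_parser and the output.csv filename are illustrative choices, not necessarily the original code):

```python
import argparse
import logging
import os

logging.basicConfig(level=logging.INFO)


def build_parser():
    """CLI arguments matching the credentials and settings from step 1."""
    parser = argparse.ArgumentParser(description="Ingest CSV data into Postgres")
    for arg in ("user", "password", "host", "port", "db", "table_name", "url"):
        parser.add_argument(f"--{arg}", required=True)
    return parser


def main(params):
    # Imported lazily so argument parsing works even where
    # pandas/sqlalchemy are not installed.
    import pandas as pd
    from sqlalchemy import create_engine

    csv_name = "output.csv"

    # Download the dataset with wget (available in the Linux-based image).
    os.system(f"wget {params.url} -O {csv_name}")
    logging.info("Downloaded the dataset")

    # Engine used to talk to the Postgres database directly.
    engine = create_engine(
        f"postgresql://{params.user}:{params.password}"
        f"@{params.host}:{params.port}/{params.db}"
    )

    df = pd.read_csv(csv_name)

    # Insert the column names first to create the table...
    df.head(n=0).to_sql(name=params.table_name, con=engine, if_exists="replace")
    logging.info("Created the table")

    # ...then insert the rest of the data.
    df.to_sql(name=params.table_name, con=engine, if_exists="append")
    logging.info("Inserted the data")
```

In the full script, an `if __name__ == "__main__":` block calls `main(build_parser().parse_args())` so the container's ENTRYPOINT can drive it.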

Before running this Python script, ensure you have the required libraries pandas, sqlalchemy and psycopg2 installed. You can install them using pip install pandas sqlalchemy psycopg2.

We run the Python script with its arguments in a 2nd Git Bash terminal:

python ingest_data.py \
--user=root \
--password=root \
--host=localhost \
--port=5432 \
--db=name_gender \
--table_name=name_gender_table \
--url="https://archive.ics.uci.edu/ml/machine-learning-databases/00591/name_gender_dataset.csv"

Note: If you get the error Unable to establish SSL connection, fret not. You probably need to get the latest version of wget here.

Step 4: Dockerizing the ingestion script

In the previous step, we directly ran a python script from our local system to ingest data into our Postgres Docker container. To make it portable, we will create a Docker container that can run Python, eliminating the need to install the required libraries (like sqlalchemy) on our local environment.

We create a Dockerfile, use the RUN command to install the required libraries and the COPY command to copy our Python script from step 3 into the container. We also specify the ENTRYPOINT as our script so that it is executed once our container is created.

FROM python:3.9  

RUN apt-get update && apt-get install -y wget
RUN pip install pandas sqlalchemy psycopg2

WORKDIR /app
COPY ingest_data.py ingest_data.py

ENTRYPOINT ["python", "ingest_data.py"]

We build this Docker image by running the following command in Git Bash:

docker build -t gender_names_ingest:v001 .

Once we build it, we get the container up and running with the command below. Note that this replaces the python ingest_data.py invocation from step 3: everything after the image name is passed as arguments to ingest_data.py through the ENTRYPOINT.

Note that we specify the network as pg-network and the host as pg-database so that our Docker container knows to connect to the Postgres Docker container created in step 1.

docker run -it \
--network=pg-network \
gender_names_ingest:v001 \
--user=root \
--password=root \
--host=pg-database \
--port=5432 \
--db=name_gender \
--table_name=name_gender_table \
--url="https://archive.ics.uci.edu/ml/machine-learning-databases/00591/name_gender_dataset.csv"

If we look at our Docker Desktop, we see that there are two containers created, one for our Postgres database and one for our python script.

Step 5: Creating a Docker container that contains pgAdmin

Now that we have our data in a database and a Docker container that runs a script to automatically ingest our data, it would be nice to have a GUI to view and confirm our data exists and perform some analysis on it. For this, let’s download another Docker image which has pgAdmin, an open-source, web-based administration tool for PostgreSQL databases.

We run the following command in a 3rd Git Bash terminal:

docker run -it \
-e PGADMIN_DEFAULT_EMAIL="admin@admin.com" \
-e PGADMIN_DEFAULT_PASSWORD="root" \
-p 8080:80 \
--network=pg-network \
--name pgadmin \
dpage/pgadmin4

Here,

  • PGADMIN_DEFAULT_EMAIL and PGADMIN_DEFAULT_PASSWORD are used to set the credentials of our pgAdmin account
  • -p maps port 8080 on our machine to port 80, the port pgAdmin listens on inside the container
  • We use the same network specified in step 1, i.e. pg-network
  • --name=pgadmin sets the name of the created Docker container to pgadmin

To set up the database on pgAdmin:

  1. Navigate to localhost:8080 and log in using the credentials admin@admin.com and root.

2. Right-click on ‘Servers’ and choose ‘Register → Server’.

3. Enter a name in the ‘Name’ section. I have named it ‘Docker-Postgres-pgAdmin’

4. Click on ‘Connection’ and set the hostname to ‘pg-database’, as this is the name we gave the Postgres Docker container in step 1. Enter the username and password from step 1 (both root) and click on ‘Save’.

5. To view the data, click on ‘Docker-Postgres-pgAdmin → Databases → name_gender → Schemas → public → Tables → name_gender_table’. Right-click on ‘name_gender_table’ and navigate to ‘View/Edit Data → First 100 Rows’.
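If you would rather sanity-check the table from Python than click through the GUI, roughly the same “first 100 rows” query works over any DB-API connection. Here is a sketch using the standard-library sqlite3 driver as a stand-in, so it runs anywhere; against the real database you would connect with psycopg2 using the step 1 credentials instead:

```python
import sqlite3

# Stand-in for: psycopg2.connect(host="localhost", port=5432,
#     dbname="name_gender", user="root", password="root")
conn = sqlite3.connect(":memory:")

# Two made-up sample rows, just so there is something to query.
conn.execute("CREATE TABLE name_gender_table (name TEXT, gender TEXT)")
conn.executemany(
    "INSERT INTO name_gender_table VALUES (?, ?)",
    [("Alex", "M"), ("Maria", "F")],
)

# Roughly the query pgAdmin runs for 'View/Edit Data -> First 100 Rows'.
rows = conn.execute("SELECT * FROM name_gender_table LIMIT 100").fetchall()
print(rows)
```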

Step 6: Combining the three containers in a single network

Phew. That was quite a journey. But we still have a last step to make our lives a little bit easier.

It is inconvenient to open multiple shells and run commands for each Docker container. Instead, we can use a docker-compose file to run all three of our containers as shown below.

services:
  pg-database:
    image: postgres:13
    environment:
      - POSTGRES_USER=root
      - POSTGRES_PASSWORD=root
      - POSTGRES_DB=name_gender
    volumes:
      - "./name_gender_data:/var/lib/postgresql/data:rw"
    ports:
      - "5432:5432"
  pgadmin:
    image: dpage/pgadmin4
    environment:
      - PGADMIN_DEFAULT_EMAIL=admin@admin.com
      - PGADMIN_DEFAULT_PASSWORD=root
    ports:
      - "8080:80"
  pythoncontainer:
    build: .
    command: "--user=root --password=root --host=pg-database --port=5432 --db=name_gender --table_name=name_gender_table --url='https://archive.ics.uci.edu/ml/machine-learning-databases/00591/name_gender_dataset.csv'"

Here, we define 3 services — one for each container:

  • pg-database for the Postgres container
  • pgadmin for the pgAdmin container
  • pythoncontainer for the Python container, which is built from our Dockerfile (build: .) and receives our arguments via command.
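One caveat with docker-compose: the pythoncontainer can start before Postgres is ready to accept connections, so the first ingestion attempt may fail. A common workaround (not part of the original script) is a small retry loop before creating the engine:

```python
import socket
import time


def wait_for_postgres(host, port, timeout=30.0):
    """Block until a TCP connection to host:port succeeds, or raise."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Postgres is reachable once the TCP handshake completes.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{host}:{port} is not reachable")
            time.sleep(1.0)
```

Calling wait_for_postgres("pg-database", 5432) at the top of the ingestion script would make the compose setup robust; a depends_on entry plus a healthcheck on pg-database in the compose file achieves a similar effect.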

Now to set it up, you only need to run the command docker-compose up on Git Bash! If you check your Docker Desktop, you can see that three containers are created.

You can check the logs to see if it is running correctly.

Once you get the message Booting worker with pid, you can navigate to localhost:8080, repeat the instructions in step 5 to register your server on pgAdmin and view your data!

Hoorah!

If you made it till here, great job, and I hope you had fun on this long journey! Credits to Alexey Grigorev’s DE-Zoomcamp for the basics!
