In this article, you will see how to draw line plots using Python’s seaborn library. The seaborn library allows you to visualize data with the help of different types of plots such as scatter plot, box plot, bar plot, count plot, heat maps, line plots, etc. This article will be focusing on line plots. A line plot is used to plot relationships between two numeric variables. The line plot shows how values on the y-axis in a 2-dimensional graph are affected by an increase or decrease in the values on the x-axis.
Table of Contents
- Installing Seaborn And Importing Required Libraries
- Plotting Line Plots Using NumPy Arrays
- Plotting Line Plots Using Pandas
- Removing the Confidence Interval from a Seaborn Line Plot
- Plotting Multiple Line Plots using Pandas
- Changing Color of Line Plots
- Plotting Dashed Line Plots
- Adding Markers to Line Plots
Installing Seaborn And Importing Required Libraries
To install Python’s seaborn library, open your command terminal and run this command:
pip install seaborn
Note that the seaborn library is built on top of Python’s Matplotlib library. pip installs matplotlib automatically as a seaborn dependency; if you install seaborn some other way, make sure matplotlib is available before you work with seaborn.
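To confirm that the installation worked, you can print the installed versions. This is a minimal sketch; it only assumes that both packages are importable. Knowing the seaborn version is handy because some parameter names differ between releases.

```python
import matplotlib
import seaborn as sns

# Print the installed versions of both libraries.
print("seaborn:", sns.__version__)
print("matplotlib:", matplotlib.__version__)
```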
The following script imports the required libraries:
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
# Jupyter magic that renders plots inside the notebook; omit it in a plain Python script
%matplotlib inline
sns.set_style("darkgrid")
The following script increases the default plot size to 10 inches wide and 8 inches high.
plt.rcParams["figure.figsize"] = [10, 8]
Plotting Line Plots Using NumPy Arrays
You can plot a line plot by passing two numpy arrays containing values for variables that you want to plot on the x and y axes. Here is an example:
x = np.arange(-20, 21)  # integers from -20 to 20
print(x)
y = np.array(x * x)  # element-wise squares of x
print(y)
sns.lineplot(x=x, y=y)
The script above creates two numpy arrays, x and y. The x array contains the integers from -20 to 20, and the y array contains the square of each item in x. Next, the lineplot() function of sns (the alias for the seaborn library) is called with the x array passed to its x parameter and the y array passed to its y parameter. Here is the output for the above script:
[-20 -19 -18 -17 -16 -15 -14 -13 -12 -11 -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
[400 361 324 289 256 225 196 169 144 121 100 81 64 49 36 25 16 9 4 1 0 1 4 9 16 25 36 49 64 81 100 121 144 169 196 225 256 289 324 361 400]
The output line plot shows a typical square function.
Plotting Line Plots Using Pandas
You can also draw seaborn line plots using Pandas dataframes. To do so, pass the names of the dataframe columns that you want to plot on the x-axis and y-axis to the x and y parameters of the lineplot() function, respectively. In addition, pass the dataframe itself to the data parameter of the lineplot() function.
The seaborn library comes with built-in datasets that you can import into a Pandas dataframe. The following script imports the “flights” dataset into the “flight_data” dataframe and prints the first five rows of the dataset.
flight_data = sns.load_dataset("flights")
flight_data.head()
The output below shows that the dataset contains three columns: year, month, and passengers.
Let’s plot a seaborn line plot which displays the relationship between years and number of passengers.
sns.lineplot(x='year',y='passengers', data = flight_data)
The output shows that from 1949 to 1960, the number of passengers who traveled by air increased almost linearly. Because the dataset contains twelve monthly values for each year, lineplot() draws the mean of those values as the line and, by default, a 95% confidence interval around that mean as the shaded region. The next section shows how you can remove the confidence interval from a line plot.
Removing the Confidence Interval from a Seaborn Line Plot
To remove the confidence interval, pass None as the value for the ci parameter of the lineplot() function, as shown in the following script. Note that recent seaborn releases (0.12 and later) deprecate ci in favor of the errorbar parameter, so there you can pass errorbar=None instead.
plt.rcParams["figure.figsize"] = [10, 8]
sns.lineplot(x='year', y='passengers', data=flight_data, ci=None)
The output below shows that the confidence interval has now been removed.
Plotting Multiple Line Plots using Pandas
If you are using a Pandas dataframe, you can plot multiple line plots, one for each unique value in a categorical column. For instance, if you want to plot 12 line plots, one for each month of the year, you need to pass the name of the column, i.e. “month”, to the hue parameter of the lineplot() function. Look at the following script for reference.
sns.lineplot(x='year',y='passengers', hue = 'month', data = flight_data, ci = None)
In the output below, you can see 12 line plots, one for each month, covering the years 1949 to 1960. The output shows that, overall, the most people travel in June, July, and August, which is understandable since this is the vacation period.
Changing Color of Line Plots
You can also change the colors of your line plots. To do so, pass a palette name to the palette parameter of the lineplot() function. The following script sets the value of the palette parameter to "bright".
sns.lineplot(x='year',y='passengers', hue = 'month', data = flight_data, ci = None, palette = "bright")
In the output below, you can see that the colors have been updated. To see the complete list of color palette options, check the official documentation.
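If you want to inspect a palette before using it, seaborn's color_palette() function returns its colors. This short sketch asks for three colors, an arbitrary number chosen for illustration.

```python
import seaborn as sns

# Ask for the first three colors of the "bright" palette as RGB tuples.
colors = sns.color_palette("bright", n_colors=3)
for rgb in colors:
    print(rgb)
```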
Plotting Dashed Line Plots
In addition to changing colors, you can also plot dashed line plots to differentiate multiple line plots. To do so, you need to pass the categorical column containing values for multiple line plots to the style parameter. It is important to mention that the style parameter may not work if you have more than 6 categories in your Pandas column; this limit applies at least to older seaborn releases. Therefore, for the sake of demonstration, the following script removes the records for the months of January to June.
flight_data6 = flight_data[
    (flight_data['month'] != "January") &
    (flight_data['month'] != "February") &
    (flight_data['month'] != "March") &
    (flight_data['month'] != "April") &
    (flight_data['month'] != "May") &
    (flight_data['month'] != "June")
]
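A more compact way to write such a filter is pandas' isin() method. The sketch below runs on a small made-up stand-in for the flights data, so the values are illustrative only. Two assumptions are worth checking against your own copy of the dataset: some seaborn versions appear to store abbreviated month names such as "Jan" (inspect flight_data['month'].unique() to see which form you have), and a categorical column keeps categories that no longer occur, which may still show up in legends unless you drop them.

```python
import pandas as pd

# Made-up stand-in for the flights data, with month stored as a categorical column.
months = ["January", "February", "July", "August"]
df = pd.DataFrame({
    "month": pd.Categorical(months, categories=months),
    "passengers": [100, 110, 150, 160],
})

# Months to drop; for the real data this list would run from January to June.
first_half = ["January", "February"]
kept = df[~df["month"].isin(first_half)].copy()

# Drop categories that no longer occur in the filtered data.
kept["month"] = kept["month"].cat.remove_unused_categories()
print(list(kept["month"].cat.categories))
```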
Next, to plot dashed line plots, pass “month” as the value for the style parameter as shown below.
sns.lineplot(x='year',y='passengers', hue = 'month', data = flight_data6, ci = None, palette = "bright", style = 'month')
The output below shows the dashed line plots. The legend for the different types of dashes is also shown at the top-left corner of the plot.
Adding Markers to Line Plots
Finally, to enhance the visibility of your line plots, you can also add markers to them. You need to pass the marker symbol to the marker parameter of the lineplot() function. The following script plots multiple line plots with circular markers.
sns.lineplot(x='year',y='passengers', hue = 'month', data = flight_data6, ci = None, palette = "bright", marker = "o")
In the output below, you can see that a marker has been added at each data point, that is, at each (year, passengers) pair that is plotted.