Histogram¶
This section showcases the histogram chart. It contains examples of how to create the histogram using the datachart.charts.Histogram function.
The examples sequentially build on each other, going from simple to complex.
As mentioned above, histograms are created using the Histogram
function found in the datachart.charts module. Let's import it:
from datachart.charts import Histogram
Double figure generation avoidance
To avoid double figure generation, the result of the Histogram
function is assigned to the _
variable. Otherwise the figure would be shown twice, because Histogram
returns the plt.Figure
object, which can later be used to save the figure locally.
Histogram Input Attributes¶
The Histogram
function accepts the attributes of the datachart.typings.HistogramChartAttrs type. In a nutshell, the input is a dict
object containing the charts
attribute, which is either a dict
or a List[dict]
where each dictionary contains some of the following attributes:
{
    "data": [{  # A list of histogram data points
        "x": Union[int, float],  # The x-axis value
    }],
    "style": {  # The style of the histogram (optional)
        "plot_hist_color": Union[str, None],  # The color of the histogram
        "plot_hist_alpha": Union[float, None],  # The transparency of the histogram
        "plot_hist_zorder": Union[int, float, None],  # The z-order of the histogram
        "plot_hist_fill": Union[str, None],  # The fill color of the histogram
        "plot_hist_hatch": Union[HATCH_STYLE, None],  # The hatch pattern
        "plot_hist_type": Union[str, None],  # The type of the histogram
        "plot_hist_align": Union[str, None],  # The alignment of the histogram
        "plot_hist_edge_width": Union[int, float, None],  # The width of the histogram edge
        "plot_hist_edge_color": Union[str, None],  # The color of the histogram edge
    },
    "subtitle": Optional[str],  # The subtitle of the chart
    "xlabel": Optional[str],  # The x-axis label
    "ylabel": Optional[str],  # The y-axis label
    "xticks": Optional[List[Union[int, float]]],  # The x-axis ticks
    "xticklabels": Optional[List[Union[str, float]]],  # The x-axis tick labels (must be same length as xticks)
    "xtickrotate": Optional[int],  # The x-axis tick labels rotation
    "yticks": Optional[List[Union[int, float]]],  # The y-axis ticks
    "yticklabels": Optional[List[Union[str, float]]],  # The y-axis tick labels (must be same length as yticks)
    "ytickrotate": Optional[int],  # The y-axis tick labels rotation
    "vlines": Optional[Union[dict, List[dict]]],  # The vertical lines
    "hlines": Optional[Union[dict, List[dict]]],  # The horizontal lines
}
For more details, see the datachart.typings.HistogramChartAttrs type.
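As a concrete illustration, the sketch below fills in the listing above with sample values. It only builds a plain dictionary. The values (colors, labels, tick positions) are made up for illustration; only the attribute names come from the listing.

```python
# A hypothetical chart definition that follows the attribute listing above.
chart = {
    # each data point is a dictionary with a single "x" value
    "data": [{"x": value} for value in [1.5, 2.0, 2.5, 4.0, 4.5]],
    "style": {
        "plot_hist_color": "#08519c",
        "plot_hist_alpha": 0.7,
        "plot_hist_edge_width": 1,
        "plot_hist_edge_color": "#000000",
    },
    "subtitle": "Sample measurements",
    "xlabel": "value",
    "ylabel": "count",
    "xticks": [1, 2, 3, 4, 5],
    "xticklabels": ["one", "two", "three", "four", "five"],
}

# xticklabels must have the same length as xticks
same_length = len(chart["xticks"]) == len(chart["xticklabels"])
print(same_length)
```

The dictionary would then be passed to Histogram as the value of the charts attribute.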
Single Histogram Chart¶
In this part, we show how to create a single histogram chart using the Histogram
function.
Let us first import the necessary libraries:
import random
Basic example. Let us first create a basic histogram chart showing a random distribution.
The following example shows how only the charts["data"]
attribute is required to draw the histogram chart.
NUM_OF_POINTS = 100
charts = {"data": [{"x": 100 * idx * random.random()} for idx in range(NUM_OF_POINTS)]}
_ = Histogram(
{
# add the data to the chart
"charts": charts,
}
)
Chart title and axis labels¶
To add the chart title and axis labels, simply add the title, xlabel and ylabel attributes.
_ = Histogram(
{
"charts": charts,
# add the title
"title": "Title",
# add the x and y axis labels
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
}
)
Number of bins¶
The histogram bins the values into a number of bins. By default, the number of bins is 20.
To change the number of bins, simply modify the num_bins
attribute.
_ = Histogram(
{
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
# change the number of bins
"num_bins": 40,
}
)
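To make the effect of the bin count more tangible, here is a minimal pure-Python sketch of equal-width binning. It is an illustration of the general technique, not datachart's implementation; the bin_counts helper is hypothetical, and it assumes the bins span the range from the smallest to the largest value.

```python
import random


def bin_counts(values, num_bins):
    """Count how many values fall into each of `num_bins` equal-width bins."""
    lo, hi = min(values), max(values)
    # fall back to a width of 1.0 when all values are equal
    width = (hi - lo) / num_bins or 1.0
    counts = [0] * num_bins
    for value in values:
        # the maximum value belongs to the last bin, so clamp the index
        idx = min(int((value - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


random.seed(0)
values = [100 * idx * random.random() for idx in range(100)]

coarse = bin_counts(values, 20)  # fewer, wider bins
fine = bin_counts(values, 40)    # more, narrower bins

# every value lands in exactly one bin, whatever the bin count
print(sum(coarse), sum(fine))
```

More bins give a finer picture of the distribution, but each bin then holds fewer values, so the chart becomes noisier.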
Figure size and grid¶
To change the figure size, simply add the figsize
attribute. The figsize
attribute is a tuple (width, height), with values in inches. The datachart
package provides a datachart.constants.FIG_SIZE constant, which contains some of the predefined figure sizes.
To add the grid, simply add the show_grid
attribute. The possible options are:
Option | Description |
---|---|
"both" | Shows both the x-axis and the y-axis grid lines. |
"x" | Shows only the x-axis grid lines. |
"y" | Shows only the y-axis grid lines. |
Again, datachart
provides a datachart.constants.SHOW_GRID constant, which contains the supported options.
from datachart.constants import FIG_SIZE, SHOW_GRID
_ = Histogram(
{
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
# add to determine the figure size
"figsize": FIG_SIZE.A4_NARROW,
# add to show the grid lines
"show_grid": SHOW_GRID.Y,
"num_bins": 40,
}
)
Histogram style¶
To change a single histogram style simply add the style
attribute with the corresponding attributes. The supported attributes are shown in the datachart.typings.HistStyleAttrs type, which contains the following attributes:
Attribute | Description |
---|---|
"plot_hist_color" | The color of the histogram (hex color code). |
"plot_hist_alpha" | The alpha of the histogram (how visible the histogram is). |
"plot_hist_zorder" | The zorder of the histogram. |
"plot_hist_fill" | The fill color of the histogram (hex color code). |
"plot_hist_hatch" | The hatch pattern of the histogram. |
"plot_hist_type" | The type of the histogram. |
"plot_hist_align" | The alignment of the histogram. |
"plot_hist_edge_width" | The width of the histogram edge. |
"plot_hist_edge_color" | The color of the histogram edge (hex color code). |
Again, to help with the style settings, the datachart.constants module contains the following constants:
Constant | Description |
---|---|
datachart.constants.HATCH_STYLE | The hatch pattern of the histogram. |
datachart.constants.HISTOGRAM_TYPE | The type of the histogram. |
from datachart.constants import HISTOGRAM_TYPE, HATCH_STYLE
_ = Histogram(
{
"charts": {
"data": charts["data"],
# define the style of the histogram
"style": {
"plot_hist_hatch": HATCH_STYLE.DIAGONAL,
"plot_hist_fill": True,
"plot_hist_edge_width": 1,
"plot_hist_edge_color": "#08519c",
"plot_hist_type": HISTOGRAM_TYPE.STEP,
},
},
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"show_grid": SHOW_GRID.Y,
"num_bins": 40,
}
)
Histogram orientation¶
To change the orientation of the histograms, simply add the orientation
attribute, which supports the following values:
Value | Description |
---|---|
"horizontal" | The histograms are horizontal. |
"vertical" | The histograms are vertical. |
Again, to help with the orientation setting, the datachart.constants module contains the following constant:
Constant | Description |
---|---|
datachart.constants.ORIENTATION | The orientation of the histogram. |
from datachart.constants import ORIENTATION
_ = Histogram(
{
"charts": {
"data": charts["data"],
"style": {
"plot_hist_hatch": HATCH_STYLE.DIAGONAL,
"plot_hist_fill": True,
"plot_hist_edge_width": 1,
"plot_hist_edge_color": "#08519c",
"plot_hist_type": HISTOGRAM_TYPE.STEP,
},
},
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
# change the grid to match the change in orientation
"show_grid": SHOW_GRID.X,
# change the orientation of the bars
"orientation": ORIENTATION.HORIZONTAL,
"num_bins": 40,
}
)
Adding vertical and horizontal lines¶
Adding vertical lines. Within the charts
attribute, define the attribute vlines
with the datachart.typings.VLinePlotAttrs typing, which is either a dict
or a List[dict]
where each dictionary contains some of the following attributes:
{
"x": Union[int, float], # The x-axis value
"ymin": Optional[Union[int, float]], # The minimum y-axis value
"ymax": Optional[Union[int, float]], # The maximum y-axis value
"style": { # The style of the line (optional)
"plot_vline_color": Optional[str], # The color of the line (hex color code)
"plot_vline_style": Optional[LineStyle], # The line style (solid, dashed, etc.)
"plot_vline_width": Optional[float], # The width of the line
"plot_vline_alpha": Optional[float], # The alpha of the line (how visible the line is)
},
"label": Optional[str], # The label of the line
}
Adding horizontal lines. Within the charts
attribute, define the attribute hlines
with the datachart.typings.HLinePlotAttrs typing, which is either a dict
or a List[dict]
where each dictionary contains some of the following attributes:
{
"y": Union[int, float], # The y-axis value
"xmin": Optional[Union[int, float]], # The minimum x-axis value
"xmax": Optional[Union[int, float]], # The maximum x-axis value
"style": { # The style of the line (optional)
"plot_hline_color": Optional[str], # The color of the line (hex color code)
"plot_hline_style": Optional[LineStyle], # The line style (solid, dashed, etc.)
"plot_hline_width": Optional[float], # The width of the line
"plot_hline_alpha": Optional[float], # The alpha of the line (how visible the line is)
},
"label": Optional[str], # The label of the line
}
To add vertical and horizontal lines, simply add the vlines
and hlines
attributes into the input dictionary.
from datachart.constants import LINE_STYLE
_ = Histogram(
{
"charts": {
"data": charts["data"],
# add a list of vertical lines
"vlines": [
{
"x": 2000 * i,
"style": {
"plot_vline_color": "green",
"plot_vline_style": LINE_STYLE.SOLID,
"plot_vline_width": 1,
},
}
for i in range(1, 4)
],
# add a single horizontal line
"hlines": {
"y": 6,
"style": {
"plot_hline_color": "red",
"plot_hline_style": LINE_STYLE.DASHED,
"plot_hline_width": 2,
"plot_hline_alpha": 0.5,
},
},
},
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"show_grid": SHOW_GRID.Y,
"num_bins": 40,
}
)
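A common use of vertical lines is to mark summary statistics. The sketch below builds vlines entries for the mean and the median using only the standard library. The summary_vlines helper is hypothetical, the colors and labels are arbitrary, and the data follows the list-of-points format from the attribute listing.

```python
import random
import statistics


def summary_vlines(values):
    """Build `vlines` entries marking the mean and median (hypothetical helper)."""
    return [
        {
            "x": statistics.mean(values),
            "style": {"plot_vline_color": "green", "plot_vline_width": 1},
            "label": "mean",
        },
        {
            "x": statistics.median(values),
            "style": {"plot_vline_color": "red", "plot_vline_width": 1},
            "label": "median",
        },
    ]


random.seed(42)
values = [100 * idx * random.random() for idx in range(100)]

# a chart definition with the data points and the two marker lines
chart = {
    "data": [{"x": value} for value in values],
    "vlines": summary_vlines(values),
}
print([line["label"] for line in chart["vlines"]])
```

The resulting chart dictionary can be used as the value of the charts attribute in the same way as in the example above.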
Multiple Histogram Charts¶
To add multiple histogram charts, simply add the charts
attribute with a list of charts, as shown below.
Attributes same as creating a single chart
We designed the datachart.charts.*
functions to use the same attribute naming when possible. To create multiple charts, the charts
attribute becomes a list of dictionaries with the same attributes as when creating a single chart.
# the charts data is now a list of charts
charts = [
    {
        "data": [{"x": 100 * idx * random.random()} for idx in range(NUM_OF_POINTS)],
    }
    for _ in range(2)
]
_ = Histogram(
{
# use a list of charts to define multiple histograms
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"show_grid": SHOW_GRID.Y,
"num_bins": 40,
}
)
Sub-chart subtitles¶
We can name each chart by adding the subtitle
attribute to each chart. In addition, to help with discerning which chart is which, use the show_legend
attribute to show the legend of the charts.
charts = [
{
**chart,
# add a subtitle to the chart (see below for more info)
"subtitle": f"Histogram {idx+1}",
}
for idx, chart in enumerate(charts)
]
_ = Histogram(
{
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"show_grid": SHOW_GRID.Y,
"num_bins": 40,
# show the legend
"show_legend": True,
}
)
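When the data already lives in named groups, the list of charts and their subtitles can be built in one step. The following is a minimal sketch under that assumption; the group names and the random samples are invented for illustration, and the result is stored in a separate named_charts variable so it does not overwrite the charts used in this section.

```python
import random

random.seed(1)

# hypothetical samples, keyed by the name that should appear in the legend
samples = {
    "Group A": [random.gauss(50, 10) for _ in range(100)],
    "Group B": [random.gauss(70, 15) for _ in range(100)],
}

# one chart per group: the data points plus the group name as the subtitle
named_charts = [
    {"data": [{"x": value} for value in values], "subtitle": name}
    for name, values in samples.items()
]
print([chart["subtitle"] for chart in named_charts])
```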
Subplots¶
To draw each chart in its own subplot, simply add the subplots
attribute. Each chart's subtitle
is then added at the top of its subplot, while the title, xlabel
and ylabel
are positioned to be global for all charts.
_ = Histogram(
{
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"show_grid": SHOW_GRID.Y,
"num_bins": 40,
# show each chart in its own subplot
"subplots": True,
}
)
Sharing the x-axis and/or y-axis across subplots¶
To share the x-axis and/or y-axis across subplots, simply add the sharex
and/or sharey
attributes, which are boolean values that specify whether to share the axis across all subplots.
_ = Histogram(
{
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"show_grid": SHOW_GRID.Y,
"num_bins": 40,
"subplots": True,
# share the x-axis across subplots
"sharex": True,
# share the y-axis across subplots
"sharey": True,
}
)
Subplot orientation¶
The orientation
attribute can be used to change the orientation of all subplots.
_ = Histogram(
{
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"num_bins": 40,
"subplots": True,
"sharex": True,
"sharey": True,
# change the grid to match the change in orientation
"show_grid": SHOW_GRID.X,
# change the orientation of the histogram
"orientation": ORIENTATION.HORIZONTAL,
}
)
Histogram Views¶
Density distribution view¶
To show the histograms as a density distribution, simply add the show_density
attribute. This scales the histogram bars to represent the density of the data instead of its frequency, so the bar heights are normalized values rather than raw counts. With bins at least one unit wide, the y-axis (or x-axis in horizontal
orientation) values lie between 0 and 1.
_ = Histogram(
{
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"show_grid": SHOW_GRID.Y,
"subplots": True,
"sharex": True,
"sharey": True,
"num_bins": 40,
# shows the density of the data
"show_density": True,
}
)
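The arithmetic behind a density view can be sketched in a few lines. This assumes show_density follows the common convention (the one used by Matplotlib's hist with density=True) of dividing each count by the total count times the bin width, so that the bar areas sum to 1. The density helper is hypothetical and not part of datachart.

```python
def density(counts, bin_width):
    """Convert raw bin counts into density values whose bar areas sum to 1."""
    total = sum(counts)
    return [count / (total * bin_width) for count in counts]


counts = [2, 5, 3]

# wide bins: every height is below 1
wide = density(counts, bin_width=10.0)
wide_area = sum(height * 10.0 for height in wide)

# narrow bins: heights can exceed 1, but the total area is still 1
narrow = density(counts, bin_width=0.1)
narrow_area = sum(height * 0.1 for height in narrow)

print(wide, narrow)
```

Under this convention it is the total area, not the tallest bar, that is fixed at 1.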
Cumulative distribution view¶
To show the histograms as a cumulative distribution, simply add the show_cumulative
attribute. This will show the cumulative distribution of the data, i.e. the height of each bar is the sum of the counts in that bin and all preceding bins.
_ = Histogram(
{
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"show_grid": SHOW_GRID.Y,
"subplots": True,
"sharex": True,
"sharey": True,
"num_bins": 40,
# shows the cumulative distribution
"show_cumulative": True,
}
)
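The cumulative view corresponds to a running sum over the bin counts. A minimal sketch with made-up counts, using only the standard library:

```python
from itertools import accumulate

# hypothetical bin counts, ordered from the lowest bin to the highest
counts = [2, 5, 3]

# each entry is the count of its own bin plus all preceding bins
cumulative = list(accumulate(counts))
print(cumulative)
```

The last cumulative value always equals the total number of data points.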
Cumulative & density distribution view¶
One can also combine the show_density
and show_cumulative
attributes. This will show the density of the data and the cumulative distribution of the data.
_ = Histogram(
{
"charts": charts,
"title": "Title",
"xlabel": "the global x-axis label",
"ylabel": "the global y-axis label",
"figsize": FIG_SIZE.A4_NARROW,
"show_grid": SHOW_GRID.X,
"subplots": True,
"sharex": True,
"sharey": True,
"num_bins": 40,
# shows the density of the data
"show_density": True,
# shows the cumulative distribution
"show_cumulative": True,
}
)
Axis scales¶
The user can change the axis scale using the scalex
and scaley
attributes. The supported scale options are:
Options | Description |
---|---|
"linear" | The linear scale. |
"log" | The log scale. |
"symlog" | The symmetric log scale. |
"asinh" | The asinh scale. |
Again, to help with the options settings, the datachart.constants module contains the following constants:
Constant | Description |
---|---|
datachart.constants.SCALE | The axis options. |
from datachart.constants import SCALE
To showcase the supported scales, we iterate through all of the scale options.
for scale in [SCALE.LINEAR, SCALE.LOG, SCALE.SYMLOG, SCALE.ASINH]:
    figure = Histogram(
        {
            "charts": charts,
            "title": f"Graph showcasing the '{scale}' scale",
            "xlabel": "the global x-axis label",
            "ylabel": "the global y-axis label",
            "figsize": FIG_SIZE.A4_NARROW,
            "show_grid": SHOW_GRID.X,
            "subplots": True,
            "sharex": True,
            "sharey": True,
            "num_bins": 40,
            # set the scale of the y-axis
            "scaley": scale,
        }
    )
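One practical difference between the scales shows up with empty bins, whose count is 0. The sketch below compares the plain mathematical log and asinh functions on a few sample counts. It illustrates the underlying functions only, not how datachart or Matplotlib implement the scales, and the helper names are hypothetical.

```python
import math

# hypothetical bin counts, including an empty bin
counts = [0, 1, 10, 100, 1000]


def log_scale(value):
    """Base-10 logarithm; undefined for zero, so return None there."""
    return math.log10(value) if value > 0 else None


def asinh_scale(value):
    """Inverse hyperbolic sine; defined everywhere, including zero."""
    return math.asinh(value)


for count in counts:
    print(count, log_scale(count), asinh_scale(count))
```

For large values asinh(x) is close to ln(2x), so it behaves like a log scale while still handling zero (and negative) values.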
Saving the Chart as an Image¶
To save the chart as an image, use the datachart.utils.save_figure function.
from datachart.utils import save_figure
save_figure(figure, "./fig_histogram.png", dpi=300)
The figure should be saved in the current working directory.