4.5. covid19_stats.engine.viz module

This module provides visualization methods for COVID-19 cumulative cases and deaths for MSAs, states, and the CONUS. The command line interfaces covid19_create_movie_or_summary, covid19_state_summary, and covid19_movie_updates are front-ends to the methods in this module.

covid19_stats.engine.viz.create_and_draw_fromfig(fig, bbox, river_linewidth=5, river_alpha=0.3, coast_linewidth=2, coast_alpha=0.4, drawGrid=True, mult_bounds_lat=1.05, mult_bounds_lng=1.05, rows=1, cols=1, num=1)

This creates a GeoAxes, with many physical geographic features and optional (but on by default) latitude and longitude grid lines, of a region specified by a bounding box. It uses a stereographic projection. For example, here is the GeoAxes displaying the CONUS.

../_images/viz_create_and_draw_fromfig_conus.png

Fig. 4.4 Demonstration of this functionality, which provides the geographical features on which COVID-19 cases and deaths are visualized.

Here are the arguments.

Parameters:
  • fig – the Figure onto which to create a GeoAxes containing geographic features. Last three arguments – rows, cols, and num – describe the relative placement of the created GeoAxes. See add_subplot for those three arguments’ meanings.

  • bbox (tuple) – a four-element tuple. Elements in order are minimum longitude, minimum latitude, maximum longitude, and maximum latitude.

  • river_linewidth (int) – the width, in pixels, of river geographical features.

  • river_alpha (float) – the color alpha of river geographical features.

  • coast_linewidth (int) – the width, in pixels, of the coast lines.

  • coast_alpha (float) – the color alpha of coast lines.

  • drawGrid (bool) – if True, then overlay the latitude and longitude grid lines. Otherwise do not. Default is True.

  • mult_bounds_lat (float) – often times, especially with geographic regions that cover a significant area of the earth, we need to put a multiplier \(> 1\) on the latitudinal extent of the plot, so that all features can be seen. By default this value is 1.05, but it must be \(\ge 1\).

  • mult_bounds_lng (float) – often times, especially with geographic regions that cover a significant area of the earth, we need to put a multiplier \(> 1\) on the longitudinal extent of the plot, so that all features can be seen. By default this value is 1.05, but it must be \(\ge 1\).

  • rows (int) – the number of rows for axes in the Figure grid. Must be \(\ge 1\), and by default is 1.

  • cols (int) – the number of columns for axes in the Figure grid. Must be \(\ge 1\), and by default is 1.

  • num (int) – the plot number of the GeoAxes in this Figure grid. Must be \(\ge 1\) and \(\le\) rows times cols. Its default is 1. See add_subplot for its meaning.

Return type:

GeoAxes
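
To make the role of mult_bounds_lat and mult_bounds_lng concrete, here is a minimal sketch of how such multipliers could widen a bounding box about its center. This is an assumption for illustration only: the helper name expand_bbox and the centering rule are hypothetical, and the module may apply the multipliers differently.

```python
def expand_bbox(bbox, mult_bounds_lat=1.05, mult_bounds_lng=1.05):
    """Scale a (min_lng, min_lat, max_lng, max_lat) box about its center.

    Hypothetical helper: it only illustrates what a multiplier >= 1 on
    the latitudinal and longitudinal extents could mean.
    """
    if mult_bounds_lat < 1 or mult_bounds_lng < 1:
        raise ValueError("multipliers must be >= 1")
    min_lng, min_lat, max_lng, max_lat = bbox
    lng_center = 0.5 * (min_lng + max_lng)
    lat_center = 0.5 * (min_lat + max_lat)
    # Half-widths of the box after applying the multipliers.
    half_lng = 0.5 * (max_lng - min_lng) * mult_bounds_lng
    half_lat = 0.5 * (max_lat - min_lat) * mult_bounds_lat
    return (lng_center - half_lng, lat_center - half_lat,
            lng_center + half_lng, lat_center + half_lat)


# Made-up bounding box, chosen only so the arithmetic is easy to follow.
wide = expand_bbox((-100.0, 30.0, -80.0, 40.0),
                   mult_bounds_lat=1.5, mult_bounds_lng=2.0)
print(wide)
```

With these made-up inputs, the longitudinal extent doubles from 20 to 40 degrees and the latitudinal extent grows from 10 to 15 degrees, while the center of the box stays fixed.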

covid19_stats.engine.viz.create_plots_daysfrombeginning(inc_data, regionName, prefix, days_from_beginning=[0], dirname='/usr/WS2/islam5/covid19_stats/docsrc')

Creates a collection of quad PNG images (see Section 3.2.3 or Section 3.3.2) representing the state of cumulative COVID-19 cases and deaths for a geographical region. As in movie mode in covid19_create_movie_or_summary or state movie mode in covid19_state_summary, the four quadrants are:

  • upper left is the summary information for the geographical region.

  • lower left is the running tally of cumulative cases and deaths, by day from first incident.

  • upper right is the logarithmic coloration of cumulative deaths, by day from first incident.

  • lower right is the logarithmic coloration of cumulative cases, by day from first incident.

create_summary_movie_frombeginning uses this functionality in a multiprocessing fashion to create MP4 movie files for geographical regions. It is easier to show rather than tell. Fig. 4.5 is a quad plot of cumulative COVID-19 cases and deaths for the NYC metro area, 150 days after this metro’s first COVID-19 incident, that is created by this function.

../_images/covid19_nyc_LATEST.0150.png

Fig. 4.5 Quad plot of cumulative COVID-19 cases and deaths for the NYC metro area, 150 days after its first incident. The name of the file is covid19_nyc_LATEST.0150.png.

The PNG images that this method creates are auto-cropped and, where needed, resized so that their widths and heights are even numbers. FFmpeg, run through create_summary_movie_frombeginning, cannot create an MP4 from PNGs unless the images’ widths and heights are divisible by 2.
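
The even-dimension requirement can be sketched in a few lines. The helper below is hypothetical and assumes odd dimensions are rounded down by one pixel; the text above does not say whether the module rounds down or up.

```python
def even_dimensions(width, height):
    """Return the largest even width and height not exceeding the inputs.

    Hypothetical helper; rounding down is an assumption.
    """
    if width < 2 or height < 2:
        raise ValueError("image must be at least 2 x 2 pixels")
    # Subtracting the remainder modulo 2 drops one pixel only when odd.
    return width - width % 2, height - height % 2


print(even_dimensions(1025, 768))
```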

Parameters:
  • inc_data (dict) – the data for incidence of COVID-19 cases and deaths for a given geographical region. See get_incident_data for the format of the output data.

  • regionName (str) – the name of the region to display in title plots. For example, in Fig. 4.10, this is NYC Metro Area.

  • prefix (str) – the identifying name to put into the output PNG files. For example, in Fig. 4.5, the prefix is nyc, and the name of the file is covid19_nyc_LATEST.0150.png. If the prefix is conus, then this module creates plots appropriate for geographic regions (such as CONUS) that cover significant areas of the earth’s surface.

  • days_from_beginning (list) – the list of days to create quad PNG images. Must be nonempty, and every element must be \(\ge 0\). Default is [ 0, ].

  • dirname (str) – the directory into which to save the quad PNG images. The default is the current working directory.

Returns:

the list of filenames of PNG quad images that this method creates, into dirname. For example, in the method invocation shown in Fig. 4.5, days_from_beginning = [ 150, ], and the list this method returns is [ '<dirname>/covid19_nyc_LATEST.0150.png', ].

Return type:

list
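
As an illustration of the naming convention, the sketch below builds the file names for a list of days. The function name quad_png_names is hypothetical, and the four-digit zero padding is inferred from the single example covid19_nyc_LATEST.0150.png, so treat it as an assumption.

```python
import os


def quad_png_names(prefix, days_from_beginning, dirname):
    """Build quad PNG file names following the pattern shown in Fig. 4.5.

    Hypothetical helper; the %04d padding is inferred from one example.
    """
    if len(days_from_beginning) == 0:
        raise ValueError("days_from_beginning must be nonempty")
    if any(day < 0 for day in days_from_beginning):
        raise ValueError("every day must be >= 0")
    return [
        os.path.join(dirname, "covid19_%s_LATEST.%04d.png" % (prefix, day))
        for day in days_from_beginning
    ]


# "output" is a made-up directory name used only for illustration.
print(quad_png_names("nyc", [150], "output"))
```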

covid19_stats.engine.viz.create_summary_cases_or_deaths_movie_frombeginning(inc_data, type_disp='cases', dirname='/usr/WS2/islam5/covid19_stats/docsrc', save_imgfiles=False)

This is the back-end method for movie cases deaths mode for covid19_create_movie_or_summary, and state movie cases deaths mode for covid19_state_summary. This creates an MP4 movie file of cumulative COVID-19 cases or deaths, with identifying metadata, for a given geographical region. Table 4.1 shows the resulting MP4 movie files, of cumulative COVID-19 cases and deaths, for the NYC metro area (top row), and the state of Virginia (bottom row).

Table 4.1 Latest cumulative COVID-19 cases, and deaths, for the NYC metro area and Virginia

NYC metro area, latest movie of COVID-19 cumulative cases

NYC metro area, latest movie of COVID-19 cumulative deaths

Virginia, latest movie of COVID-19 cumulative cases

Virginia, latest movie of COVID-19 cumulative deaths

Here are the arguments.

Parameters:
  • inc_data (dict) – the data for incidence of COVID-19 cases and deaths for a given geographical region. See get_incident_data for the format of the output data.

  • type_disp – if cases, then show cumulative COVID-19 cases. If deaths, then show cumulative COVID-19 deaths. Can only be cases or deaths.

  • dirname (str) – the directory into which to save the MP4 movie file, and optionally a zip archive of the PNG image files used to create the MP4 movie. The default is the current working directory.

  • save_imgfiles (bool) – if True, then will create a zip archive of the PNG image files used to create the MP4 movie. Its full name is <dirname>/covid19_<prefix>_<type_disp>_LATEST_imagefiles.zip. <dirname> is the directory to save the MP4 file, <prefix> is the region name prefix (for example nyc for the NYC metro area) located in inc_data['prefix'], and <type_disp> is either cases or death. The default is False.

Returns:

the base name of the MP4 movie file it creates. For example, if inc_data['prefix'] is nyc and type_disp is cases, this method returns covid19_nyc_cases_LATEST.mp4. This method also saves the MP4 file as <dirname>/covid19_nyc_cases_LATEST.mp4, where <dirname> is the directory to save the MP4 file.

Return type:

str
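
The relationship between the returned base name, the saved MP4 path, and the optional zip archive can be sketched as follows. The function name is hypothetical, and the sketch assumes that type_disp is inserted verbatim into both file names.

```python
import os


def cases_or_deaths_movie_names(prefix, type_disp="cases", dirname=".",
                                save_imgfiles=False):
    """Collect the file names described above for one region.

    Hypothetical helper; it creates no files and only formats names.
    """
    if type_disp not in ("cases", "deaths"):
        raise ValueError("type_disp can only be 'cases' or 'deaths'")
    base = "covid19_%s_%s_LATEST.mp4" % (prefix, type_disp)
    names = {
        "returned": base,                       # base name the method returns
        "movie": os.path.join(dirname, base),   # where the MP4 is saved
    }
    if save_imgfiles:
        # Assumption: the archive name uses type_disp verbatim.
        names["archive"] = os.path.join(
            dirname,
            "covid19_%s_%s_LATEST_imagefiles.zip" % (prefix, type_disp))
    return names


# "output" is a made-up directory name used only for illustration.
print(cases_or_deaths_movie_names("nyc", "cases", "output", save_imgfiles=True))
```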

covid19_stats.engine.viz.create_summary_movie_frombeginning(inc_data, dirname='/usr/WS2/islam5/covid19_stats/docsrc', save_imgfiles=False)

This is the back-end method for movie mode for covid19_create_movie_or_summary, and state movie mode for covid19_state_summary. This creates an MP4 quad movie file of both cumulative COVID-19 cases and deaths for a geographical region, and optionally a zip archive of PNG images used to create the MP4 file. This uses create_plots_daysfrombeginning in a multiprocessing fashion, to create sub-collections of PNG quad images, and then collate them into an MP4 file using FFmpeg. Table 4.2 shows the resulting MP4 movie files, of cumulative COVID-19 cases and deaths, for the NYC metro area and the state of Virginia.

Table 4.2 Latest cumulative quad movies of COVID-19 for the NYC metro area and Virginia

NYC metro area, latest quad movie of COVID-19 cumulative cases and deaths

Virginia, latest quad movie of COVID-19 cumulative cases and deaths

Here are the arguments.

Parameters:
  • inc_data (dict) – the data for incidence of COVID-19 cases and deaths for a given geographical region. See get_incident_data for the format of the output data.

  • dirname (str) – the directory into which to save the MP4 movie file, and optionally a zip archive of the PNG image files used to create the MP4 movie. The default is the current working directory.

  • save_imgfiles (bool) – if True, then will create a zip archive of the PNG image files used to create the MP4 movie. Its full name is <dirname>/covid19_<prefix>_LATEST_imagefiles.zip. <dirname> is the directory to save the MP4 file, and <prefix> is the region name prefix (for example nyc for the NYC metro area) located in inc_data['prefix']. The default is False.

Returns:

the base name of the MP4 movie file it creates. For example, if inc_data['prefix'] is nyc, this method returns covid19_nyc_LATEST.mp4. This method also saves the MP4 file as <dirname>/covid19_nyc_LATEST.mp4, where <dirname> is the directory to save the MP4 file.

Return type:

str
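
One way to picture the multiprocessing step is to split the days into sub-lists, each of which a separate worker could pass as days_from_beginning to create_plots_daysfrombeginning. The sketch below is an assumption: the helper split_days and its interleaved partitioning are hypothetical, and the module may divide the work differently.

```python
def split_days(num_days, num_chunks):
    """Deal day indices 0 .. num_days - 1 into interleaved sub-lists.

    Hypothetical helper; interleaving gives each worker a similar mix
    of early and late days.
    """
    if num_days < 1 or num_chunks < 1:
        raise ValueError("num_days and num_chunks must be >= 1")
    days = list(range(num_days))
    # Never create more sub-lists than there are days.
    return [days[i::num_chunks] for i in range(min(num_chunks, num_days))]


chunks = split_days(10, 4)
print(chunks)
```

Each sub-list could then be handed to one process, for example through multiprocessing.Pool, before the resulting PNG files are collated into the MP4.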

covid19_stats.engine.viz.display_fips(collection_of_fips, fig, **kwargs)

Method that is very similar to display_fips_geom, except this also displays the FIPS code of each county. For example, here is Rhode Island.

../_images/viz_display_fips_rhodeisland.png

Fig. 4.6 Demonstration of this method showing the counties in Rhode Island. The FIPS code of each county is shown in red. One can extract the patches in this object to manually change the colors of these county polygons.

Here are the arguments.

Parameters:
  • collection_of_fips – can be a list, set, or other iterable of FIPS codes to visualize and label.

  • fig – the Figure onto which to draw this GeoAxes.

Return type:

GeoAxes

covid19_stats.engine.viz.display_fips_geom(fips_data, fig, **kwargs)

Demonstrative plot, returning a GeoAxes, of a FIPS data collection. For example, here is the NYC Metro Area.

../_images/viz_display_fips_geom_nyc.png

Fig. 4.7 Demonstration of this method showing the counties in the NYC Metro Area. One can extract the patches in this object to manually change the colors of these county polygons.

Return type:

GeoAxes

covid19_stats.engine.viz.display_msa(msaname, fig, doShow=False, **kwargs)

Convenience method that visualizes and labels, by FIPS code, the counties in a Metropolitan Statistical Area. It can optionally save the output to a file, msa_<msaname>_counties.png. Here is an example of the NYC Metro Area.

../_images/viz_display_msa_nyc.png

Fig. 4.8 Display of the NYC Metro Area, with extra annotations beyond what display_fips can do.

Here are the arguments.

Parameters:
  • msaname (str) – the identifying name for the MSA, for example nyc.

  • fig – the Figure onto which to draw this GeoAxes.

  • doShow (bool) – if False, then just display the figure. If True, also save to a file, msa_<msaname>_counties.png. Default is False.

Return type:

GeoAxes

covid19_stats.engine.viz.get_summary_demo_data(inc_data, dirname='/usr/WS2/islam5/covid19_stats/docsrc', store_data=True)

This is the back-end method for show mode for covid19_create_movie_or_summary, and state show mode for covid19_state_summary. This creates six or seven files for a given geographical region. Given an input inc_data dict, it always produces the six plot files listed below, and it produces a seventh data file when store_data is True. Here prefix is the value of inc_data['prefix'] (for example nyc for the NYC metro area).

  • covid19_<prefix>_cases_LATEST.pdf and covid19_<prefix>_cases_LATEST.png: a PDF and PNG plot of the latest cumulative COVID-19 cases for the geographical region.

  • covid19_<prefix>_death_LATEST.pdf and covid19_<prefix>_death_LATEST.png: a PDF and PNG plot of the latest cumulative COVID-19 deaths for the geographical region.

  • covid19_<prefix>_cds_LATEST.pdf and covid19_<prefix>_cds_LATEST.png: a PDF and PNG plot of the latest cumulative COVID-19 case and death trend lines for the geographical region.

Optionally, one can choose to dump out a serialized Pandas DataFrame of the COVID-19 cases and deaths, total and per county, from the date of first incident to the latest incident. Its file name is covid19_<prefix>_LATEST.pkl.gz.

Table 4.3 displays the latest output for the NYC metro area.

Table 4.3 Latest plots of cumulative COVID-19 cases, deaths, and trend lines for the NYC metro area

covid19_nyc_cases_latest

covid19_nyc_death_latest

covid19_nyc_cds_latest

NYC metro area, plot of latest COVID-19 cumulative cases

NYC metro area, plot of latest COVID-19 cumulative deaths

NYC metro area, plot of latest trend lines of COVID-19 cumulative cases and deaths

Here are the arguments.

Parameters:
  • inc_data (dict) – the data for incidence of COVID-19 cases and deaths for a given geographical region. See get_incident_data for the format of the output data.

  • dirname (str) – the directory into which to save the six or seven files. The default is the current working directory.

  • store_data (bool) – if True, then create the serialized Pandas DataFrame of the COVID-19 cases and deaths, total and per county, from the date of first incident to the latest incident. Default is True.
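
The six or seven output names can be enumerated with a short sketch. The function name summary_output_names is hypothetical; the file name patterns are those listed above.

```python
def summary_output_names(prefix, store_data=True):
    """List the base names of the files described above.

    Hypothetical helper; it creates no files and only formats names.
    """
    names = []
    for kind in ("cases", "death", "cds"):
        for extension in ("pdf", "png"):
            names.append("covid19_%s_%s_LATEST.%s" % (prefix, kind, extension))
    if store_data:
        # Serialized Pandas DataFrame of cases and deaths.
        names.append("covid19_%s_LATEST.pkl.gz" % prefix)
    return names


for name in summary_output_names("nyc"):
    print(name)
```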

covid19_stats.engine.viz.my_colorbar(mappable, ax, **kwargs)

Creates a colorbar using the recipe from https://joseph-long.com/writing/colorbars. I do not fully understand why this recipe works, but it does. The parameter descriptions below are copied from the description of the colorbar method. This version has also been updated so that it works on GeoAxes.

Parameters:
  • mappable – a ScalarMappable described by this colorbar.

  • ax – the parent Axes from whose space a new colorbar axes will be stolen.

Returns:

the underlying Colorbar.

Return type:

Colorbar

covid19_stats.engine.viz.plot_cases_deaths_region(inc_data, regionName, ax, days_from_beginning=0, doTitle=True, legend_text_scaling=1.0, aspect_ratio_mult=1.0)

Plots trend lines of cumulative COVID-19 cases and deaths for a region. It is easier to show rather than tell. Fig. 4.9 depicts trend lines of cumulative COVID-19 cases and deaths for the NYC metro area, 150 days after this metro’s first COVID-19 incident.

../_images/viz_plot_cases_deaths_region_nyc.png

Fig. 4.9 Plot of cumulative COVID-19 cases and deaths for the NYC metro area, 150 days after its first incident. Plot scaling is logarithmic, and dots accentuate the state of the cumulative cases and deaths 150 days after first incident. We have chosen to display the title.

Here are the arguments.

Parameters:
  • inc_data (dict) – the data for incidence of COVID-19 cases and deaths for a given geographical region. See get_incident_data for the format of the output data.

  • regionName (str) – the name of the region to display in title plots. For example, in Fig. 4.10, this is NYC Metro Area.

  • ax – the Axes onto which to make this plot.

  • days_from_beginning (int) – days after first incident of COVID-19 in this region. Must be \(\ge 0\).

  • doTitle (bool) – if True, then display the title over the plot. Default is True.

  • legend_text_scaling (float) – sometimes the legend text for the cumulative COVID-19 cases and deaths is too large. This is a multiplier on that text’s font size. Default is 1.0, but must be \(> 0\).

  • aspect_ratio_mult (float) – in the quad plots created in create_plots_daysfrombeginning or in create_summary_movie_frombeginning, without modification the Axes may look too squashed and inconsistent with the three other Axes or GeoAxes. This acts as a multiplier on the aspect ratio so that this Axes does not look out of place. Default is 1.0, but must be \(> 0\).

covid19_stats.engine.viz.plot_cases_or_deaths_bycounty(inc_data, regionName, fig, type_disp='cases', days_from_beginning=0, doTitle=True, plot_artists={}, poly_line_width=1.0, legend_text_scaling=1.0, doSmarter=False, rows=1, cols=1, num=1)

The lower-level function that displays the status of COVID-19 cases or deaths given an incident data dict, inc_data. It displays the status of cumulative COVID-19 cases or deaths, a specific number of days from the beginning, coloring the counties in that region according to the legend maximum, and places the resulting GeoAxes at a specific location in a Figure grid of Axes or GeoAxes.

Instead of returning a GeoAxes, this initializes a dict of matplotlib objects, plot_artists. In this way, subsequent plots, e.g. for different days after the beginning, do not have to perform the relatively costly operation of recreating the GeoAxes and fully painting in the Polygon patches; instead, these Polygon patches are re-colored and necessary Text artists’ strings are changed.

This dict, plot_artists, has the following keys,

  • axes: when initialized, the GeoAxes that consists of all counties, with COVID-19 cases or deaths, to display.

  • sm: the ScalarMappable describing the coloration by value for each county.
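
The reuse pattern behind plot_artists can be sketched without matplotlib. In the hypothetical example below, plain dicts stand in for the GeoAxes and the ScalarMappable. The first call builds them and later calls only update them.

```python
def draw_day(day, plot_artists):
    """Build expensive objects once, then only update them on later calls.

    Hypothetical stand-in for the caching behavior described above.
    """
    if "axes" not in plot_artists:
        # Costly one-time setup; plain dicts replace the real objects here.
        plot_artists["axes"] = {"builds": 1, "label": ""}
        plot_artists["sm"] = {"vmin": 1.0}
    # Cheap per-day update, analogous to re-coloring patches and
    # changing the strings of Text artists.
    plot_artists["axes"]["label"] = "day %d" % day
    return plot_artists


artists = {}
for day in (0, 1, 2):
    draw_day(day, artists)
print(artists["axes"])
```

Passing the same dict to every call is what lets later calls skip the setup.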

Furthermore, it is easier to show rather than tell. Fig. 4.10 depicts both cumulative COVID-19 cases and deaths for the NYC metro area, 150 days after this metro’s first COVID-19 incident.

../_images/viz_plot_cases_or_deaths_bycounty_nyc.png

Fig. 4.10 On the left are the cumulative COVID-19 cases, and on the right are the cumulative COVID-19 deaths, for the NYC metro area, 150 days after its first COVID-19 incident. The color limit for cases (left) is \(1.7\times 10^6\), while the color limit for deaths (right) is \(5.6\times 10^4\). We have chosen to display the titles over both plots. Color scaling is logarithmic.
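
To illustrate what logarithmic color scaling means, the sketch below maps a count onto the interval from 0 to 1 with the same formula that logarithmic normalization uses. The function name log_fraction and the lower limit of 1 are assumptions; the caption states only the upper color limits.

```python
import math


def log_fraction(value, vmax, vmin=1.0):
    """Map value onto [0, 1] on a logarithmic scale between vmin and vmax.

    Hypothetical helper; vmin=1.0 is an assumption.
    """
    if not 0 < vmin < vmax:
        raise ValueError("need 0 < vmin < vmax")
    # Clip so values outside the limits land on the ends of the scale.
    clipped = min(max(value, vmin), vmax)
    return ((math.log10(clipped) - math.log10(vmin))
            / (math.log10(vmax) - math.log10(vmin)))


# Made-up count measured against the cases color limit from Fig. 4.10.
print(log_fraction(1.0e3, 1.7e6))
```

On such a scale each factor of ten in the count moves the color by the same amount, so counties with a few hundred cases remain distinguishable from counties with hundreds of thousands.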

Here are the arguments.

Parameters:
  • inc_data (dict) – the data for incidence of COVID-19 cases and deaths for a given geographical region. See get_incident_data for the format of the output data.

  • regionName (str) – the name of the region to display in title plots. For example, in Fig. 4.10, this is NYC Metro Area.

  • fig – the Figure onto which to create a GeoAxes (stored into the plot_artists dict) containing geographic features. Last three arguments – rows, cols, and num – describe the relative placement of the created GeoAxes. See add_subplot for those three arguments’ meanings.

  • type_disp (str) – if cases, then show cumulative COVID-19 cases. If deaths, then show cumulative COVID-19 deaths. Can only be cases or deaths.

  • days_from_beginning (int) – days after first incident of COVID-19 in this region. Must be \(\ge 0\).

  • doTitle (bool) – if True, then display the title over the plot. Default is True.

  • plot_artists (dict) – this contains the essential plotting objects for quicker re-display when plotting different days. See the description of its keys above.

  • poly_line_width (float) – the line width of the counties to draw in the plot.

  • legend_text_scaling (float) – sometimes the text annotations showing the date, number of incident days, and cumulative deaths or cases are too large. This is a multiplier on that text’s font size. Default is 1.0, but must be \(> 0\).

  • doSmarter (bool) – if False, then make a plot tailored for small regions (relative to the size of the earth), such as states or MSAs. If True, then make a plot tailored for large regions such as the CONUS. Default is False.

  • rows (int) – the number of rows for axes in the Figure grid. Must be \(\ge 1\), and by default is 1.

  • cols (int) – the number of columns for axes in the Figure grid. Must be \(\ge 1\), and by default is 1.

  • num (int) – the plot number of the GeoAxes in this Figure grid. Must be \(\ge 1\) and \(\le\) rows times cols. Its default is 1. See add_subplot for its meaning.
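
The constraints on rows, cols, and num follow the add_subplot convention, in which the plot number starts at 1 in the upper left and increases to the right, row by row. The sketch below, with a hypothetical helper name, checks the constraints and converts num into a zero-based grid position.

```python
def subplot_position(rows, cols, num):
    """Return the zero-based (row, column) of plot number num.

    Hypothetical helper following the add_subplot numbering convention.
    """
    if rows < 1 or cols < 1:
        raise ValueError("rows and cols must be >= 1")
    if not 1 <= num <= rows * cols:
        raise ValueError("num must satisfy 1 <= num <= rows * cols")
    # Row-major numbering: integer quotient is the row, remainder the column.
    return divmod(num - 1, cols)


print(subplot_position(2, 2, 3))
```

In a 2 by 2 grid such as the quad plots, num=3 therefore refers to the lower-left position.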