Histogram

../../_images/histogram-header.png

What it is

The Histogram displays the distribution of data across a continuous variable by aggregating the data into bins of a fixed size. Vertical bars show the count of data within each bin, with taller bars indicating areas of higher density within the dataset. In MapD Immerse, you can also use a Histogram to count occurrences of data in a column other than the binned dimension (shown in “How to set it up” below).
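To make the binning idea concrete, here is a minimal Python sketch of fixed-size binning. It is an illustration only, not how Immerse computes its bins: the function name `histogram_counts`, the sample follower counts, and the bin size of 150 are all assumptions chosen for the example.

```python
from collections import Counter


def histogram_counts(values, bin_size):
    """Count how many values fall into each fixed-width bin.

    Each bin is keyed by its lower edge, so a key of 150 with
    bin_size=150 covers values from 150 up to (not including) 300.
    """
    counts = Counter()
    for value in values:
        bin_start = (value // bin_size) * bin_size
        counts[bin_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical follower counts, invented for illustration.
followers = [12, 40, 95, 130, 149, 150, 310, 480, 2200]
counts = histogram_counts(followers, bin_size=150)
print(counts)
```

Each key in the result corresponds to one bar on the X axis, and each value is the height of that bar.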

When to use it

Use a Histogram to understand the distribution of your data and to see areas of unusually high or low density that a simple aggregate such as Average would mask.
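The following Python sketch illustrates how an average can hide what a histogram reveals. The two datasets and the helper `bin_counts` are hypothetical and exist only for this example; they do not come from Immerse.

```python
from collections import Counter
from statistics import mean


def bin_counts(values, bin_size):
    """Return the count of values per fixed-width bin, keyed by lower edge."""
    counts = Counter((v // bin_size) * bin_size for v in values)
    return dict(sorted(counts.items()))


# Two invented datasets with the same average but different shapes.
even = [100, 100, 100, 100, 100, 100]
skewed = [0, 0, 0, 0, 0, 600]

even_mean, skewed_mean = mean(even), mean(skewed)
even_bins = bin_counts(even, bin_size=150)
skewed_bins = bin_counts(skewed, bin_size=150)

print(even_mean, skewed_mean)  # the averages are identical
print(even_bins)               # every value lands in a single bin
print(skewed_bins)             # most values are low, with one far outlier
```

The averages match, but the binned counts show that the second dataset is dominated by low values plus a single outlier.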

How to set it up

The animation below uses Twitter data to set up two histograms.

The first histogram in the animation shows the distribution of data for number of followers, indicating that a large number of people have fewer than 150 followers, followed by a diminishing “long tail” of people with more than that number.

The second histogram in the animation forms bins based on one column (followers), but sets the height of the bars from the count of a different column (the number of followees). This shows how the count of one column varies across groupings of another column. In this case, people with more followers also tend to follow more accounts themselves, up to about 20,000 followers, at which point the relationship becomes more tenuous.
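The second histogram can be approximated in Python as follows: rows are grouped into bins by one column, and the bar height comes from an aggregate of another column. This is a sketch under stated assumptions: it uses an average as the aggregate (Immerse may use a different one), and the function `binned_average`, the sample rows, and the bin size of 300 are hypothetical.

```python
from collections import defaultdict


def binned_average(rows, bin_size):
    """Bin rows by followers and average followees within each bin.

    rows is an iterable of (followers, followees) pairs. The choice of
    an average as the aggregate is an assumption for illustration.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for followers, followees in rows:
        bin_start = (followers // bin_size) * bin_size
        sums[bin_start] += followees
        counts[bin_start] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}


# Invented (followers, followees) pairs, not real Twitter data.
rows = [(50, 80), (120, 100), (400, 300), (430, 500), (900, 700)]
averages = binned_average(rows, bin_size=300)
print(averages)
```

The keys still come from the binned dimension (followers), while the values, and therefore the bar heights, come from the second column (followees).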

../../_images/histogram-create.gif

Once you have chosen your measure and dimension, you can edit the labels for the X and Y axes. Click the label and enter your custom text.

../../_images/histogram-edit-axes.png