Summary and comparison of memory usage of dataframe columns — inspect_mem • inspectdf

For a single dataframe, summarise the memory usage in each column. If two dataframes are supplied, compare memory usage for columns appearing in both dataframes. For grouped dataframes, summarise the memory usage separately for each group.

inspect_mem(df1, df2 = NULL)

Arguments

df1: A data frame.
df2: An optional second data frame with which to compare memory usage. Defaults to NULL.

Value

A tibble summarising and comparing the columnwise memory usage for one or a pair of data frames.

Details

For a single dataframe, the tibble returned contains the columns:

col_name, a character vector containing column names of df1.
bytes, an integer vector containing the number of bytes in each column of df1.
size, a character vector containing display-friendly memory usage of each column.
pcnt, the percentage of the dataframe's total memory footprint used by each column.
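
As a rough illustration of what these columns mean, the sketch below builds a similar summary with base R only. It is not the inspectdf implementation: `column_memory()` and `example_df` are hypothetical names, and the use of `utils::object.size()` to measure each column is an assumption. Taking `pcnt` as each column's share of the summed column sizes is consistent with the starwars output in the Examples, where the `bytes` values total 55680 and films accounts for 20008 / 55680, about 35.9%.

```r
# Hypothetical helper, not part of inspectdf: approximate per-column memory
# usage with utils::object.size() from base R.
column_memory <- function(df) {
  # Size of each column in bytes
  bytes <- vapply(df, function(col) as.numeric(utils::object.size(col)), numeric(1))
  # Display-friendly size strings, formatted one value at a time
  size <- vapply(
    unname(bytes),
    function(b) format(structure(b, class = "object_size"), units = "auto"),
    character(1)
  )
  out <- data.frame(
    col_name = names(df),
    bytes    = unname(bytes),
    size     = size,
    # Share of the summed column sizes, as a percentage
    pcnt     = 100 * unname(bytes) / sum(bytes),
    stringsAsFactors = FALSE
  )
  # Largest columns first, as in the example output
  out <- out[order(out$bytes, decreasing = TRUE), ]
  rownames(out) <- NULL
  out
}

# Made-up data for illustration
example_df <- data.frame(
  id    = 1:1000,
  score = runif(1000),
  label = as.character(1:1000),
  stringsAsFactors = FALSE
)
mem_summary <- column_memory(example_df)
print(mem_summary)
```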

For a pair of dataframes, the tibble returned contains the columns:

col_name, a character vector containing column names of df1 and df2.
size_1, size_2, character vectors containing the display-friendly memory usage of each column in df1 and df2, respectively.
pcnt_1, pcnt_2, the percentage of total memory usage of each column within each of df1 and df2.
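
The paired layout can be pictured as two single-dataframe summaries joined on `col_name`. The sketch below is an assumption-laden illustration in base R, not the package's code: `column_pcnt()`, `df_a` and `df_b` are hypothetical, sizes are measured with `utils::object.size()`, and an inner join is used on the reading that only columns appearing in both dataframes are compared.

```r
# Hypothetical helper, not part of inspectdf: percentage of summed column
# sizes used by each column.
column_pcnt <- function(df) {
  bytes <- vapply(df, function(col) as.numeric(utils::object.size(col)), numeric(1))
  data.frame(
    col_name = names(df),
    pcnt     = 100 * unname(bytes) / sum(bytes),
    stringsAsFactors = FALSE
  )
}

# Made-up data: df_b is a subset of df_a with one column dropped
df_a <- data.frame(x = 1:100, y = runif(100), z = rep("a", 100),
                   stringsAsFactors = FALSE)
df_b <- df_a[1:20, c("x", "y")]

# Inner join on col_name; the suffixes give the pcnt_1 / pcnt_2 column names
comparison <- merge(column_pcnt(df_a), column_pcnt(df_b),
                    by = "col_name", suffixes = c("_1", "_2"))
print(comparison)
```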

For a grouped dataframe, the tibble returned is as for a single dataframe, except that the first k columns are the grouping columns. For each unique combination of the grouping variables, the result contains one row per non-grouping column.
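
The grouped case amounts to a split-apply-combine over the groups. The base R sketch below is illustrative only and makes several assumptions: the data and all names are hypothetical, sizes come from `utils::object.size()`, and the grouping column is left out of the per-group summary, which matches the example output (3 groups of 13 columns giving 39 rows).

```r
# Made-up data with a single grouping column g
df <- data.frame(
  g = rep(c("a", "b"), each = 50),
  x = 1:100,
  y = runif(100),
  stringsAsFactors = FALSE
)

# Split the non-grouping columns by group.
# Note: split() drops rows where the grouping value is NA.
pieces <- split(df[setdiff(names(df), "g")], df$g)

# Summarise each group's columns separately
per_group <- lapply(names(pieces), function(grp) {
  part  <- pieces[[grp]]
  bytes <- vapply(part, function(col) as.numeric(utils::object.size(col)), numeric(1))
  data.frame(
    g        = grp,
    col_name = names(part),
    bytes    = unname(bytes),
    pcnt     = 100 * unname(bytes) / sum(bytes),
    stringsAsFactors = FALSE
  )
})

# Stack the per-group summaries: grouping column first, then the summary
grouped_summary <- do.call(rbind, per_group)
print(grouped_summary)
```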

Author

Alastair Rushworth

Examples

# Load inspectdf for inspect_mem()
library(inspectdf)
# Load dplyr for starwars data & pipe
library(dplyr)

# Single dataframe summary
inspect_mem(starwars)
#> # A tibble: 14 × 4
#>    col_name   bytes size        pcnt
#>    <chr>      <int> <chr>      <dbl>
#>  1 films      20008 19.54 Kb  35.9  
#>  2 starships   7448 7.27 Kb   13.4  
#>  3 name        6280 6.13 Kb   11.3  
#>  4 vehicles    5944 5.8 Kb    10.7  
#>  5 homeworld   3608 3.52 Kb    6.48 
#>  6 species     2952 2.88 Kb    5.30 
#>  7 skin_color  2656 2.59 Kb    4.77 
#>  8 eye_color   1608 1.57 Kb    2.89 
#>  9 hair_color  1440 1.41 Kb    2.59 
#> 10 sex          976 976 bytes  1.75 
#> 11 gender       872 872 bytes  1.57 
#> 12 mass         744 744 bytes  1.34 
#> 13 birth_year   744 744 bytes  1.34 
#> 14 height       400 400 bytes  0.718

# Paired dataframe comparison
inspect_mem(starwars, starwars[1:20, ])
#> # A tibble: 14 × 5
#>    col_name   size_1    size_2    pcnt_1 pcnt_2
#>    <chr>      <chr>     <chr>      <dbl>  <dbl>
#>  1 films      19.54 Kb  7.23 Kb   35.9   40.5  
#>  2 starships  7.27 Kb   2.58 Kb   13.4   14.4  
#>  3 name       6.13 Kb   1.49 Kb   11.3    8.35 
#>  4 vehicles   5.8 Kb    1.74 Kb   10.7    9.75 
#>  5 homeworld  3.52 Kb   816 bytes  6.48   4.46 
#>  6 species    2.88 Kb   552 bytes  5.30   3.02 
#>  7 skin_color 2.59 Kb   808 bytes  4.77   4.41 
#>  8 eye_color  1.57 Kb   664 bytes  2.89   3.63 
#>  9 hair_color 1.41 Kb   736 bytes  2.59   4.02 
#> 10 sex        976 bytes 440 bytes  1.75   2.40 
#> 11 gender     872 bytes 336 bytes  1.57   1.84 
#> 12 mass       744 bytes 208 bytes  1.34   1.14 
#> 13 birth_year 744 bytes 208 bytes  1.34   1.14 
#> 14 height     400 bytes 176 bytes  0.718  0.962

# Grouped dataframe summary
starwars %>% group_by(gender) %>% inspect_mem()
#> # A tibble: 39 × 5
#> # Groups:   gender [3]
#>    gender   col_name   bytes size       pcnt
#>    <chr>    <chr>      <int> <chr>     <dbl>
#>  1 feminine films       3816 3.73 Kb   33.0 
#>  2 feminine starships   1256 1.23 Kb   10.9 
#>  3 feminine name        1248 1.22 Kb   10.8 
#>  4 feminine vehicles    1176 1.15 Kb   10.2 
#>  5 feminine homeworld    776 776 bytes  6.71
#>  6 feminine skin_color   744 744 bytes  6.43
#>  7 feminine species      664 664 bytes  5.74
#>  8 feminine eye_color    528 528 bytes  4.56
#>  9 feminine hair_color   520 520 bytes  4.50
#> 10 feminine sex          296 296 bytes  2.56
#> # … with 29 more rows
#> # ℹ Use `print(n = ...)` to see more rows