Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 16650 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 3746 |
| Duplicate rows (%) | 22.5% |
| Total size in memory | 1.9 MiB |
| Average record size in memory | 121.0 B |
Variable types
| Categorical | 9 |
|---|---|
| Boolean | 3 |
| Numeric | 4 |
| Message | Type |
|---|---|
| possui_celular has constant value "1" | Constant |
| Dataset has 3746 (22.5%) duplicate rows | Duplicates |
| qtd_filhos is highly correlated with qt_pessoas_residencia | High correlation |
| qt_pessoas_residencia is highly correlated with qtd_filhos | High correlation |
| idade is highly correlated with tempo_emprego | High correlation |
| tempo_emprego is highly correlated with idade | High correlation |
| possui_celular is highly correlated with educacao and 10 other fields | High correlation |
| educacao is highly correlated with possui_celular | High correlation |
| mau is highly correlated with possui_celular | High correlation |
| posse_de_imovel is highly correlated with possui_celular | High correlation |
| possui_fone_comercial is highly correlated with possui_celular | High correlation |
| estado_civil is highly correlated with possui_celular | High correlation |
| possui_email is highly correlated with possui_celular | High correlation |
| sexo is highly correlated with possui_celular | High correlation |
| posse_de_veiculo is highly correlated with possui_celular | High correlation |
| tipo_residencia is highly correlated with possui_celular | High correlation |
| possui_fone is highly correlated with possui_celular | High correlation |
| tipo_renda is highly correlated with possui_celular | High correlation |
| sexo is highly correlated with posse_de_veiculo | High correlation |
| posse_de_veiculo is highly correlated with sexo | High correlation |
| tipo_renda is highly correlated with idade | High correlation |
| idade is highly correlated with tipo_renda | High correlation |
| qtd_filhos has 11486 (69.0%) zeros | Zeros |
Reproduction
| Analysis started | 2021-10-01 01:39:24.177542 |
|---|---|
| Analysis finished | 2021-10-01 01:39:32.941231 |
| Duration | 8.76 seconds |
| Software version | pandas-profiling v3.1.0 |
| Download configuration | config.json |
sexo
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.2 KiB |
| F | |
|---|---|
| M |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | M |
|---|---|
| 2nd row | F |
| 3rd row | F |
| 4th row | M |
| 5th row | F |
Common Values
| Value | Count | Frequency (%) |
| F | 11201 | 67.3% |
| M | 5449 | 32.7% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| f | 11201 | |
| m | 5449 |
posse_de_veiculo
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 16.4 KiB |
| False | |
|---|---|
| True |
| Value | Count | Frequency (%) |
| False | 10178 | 61.1% |
| True | 6472 | 38.9% |
posse_de_imovel
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 16.4 KiB |
| True | |
|---|---|
| False |
| Value | Count | Frequency (%) |
| True | 11176 | 67.1% |
| False | 5474 | 32.9% |
qtd_filhos
Real number (ℝ≥0)
HIGH CORRELATION, ZEROS
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.4331531532 |
| Minimum | 0 |
|---|---|
| Maximum | 14 |
| Zeros | 11486 |
| Zeros (%) | 69.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.2 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 2 |
| Maximum | 14 |
| Range | 14 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.7393953444 |
|---|---|
| Coefficient of variation (CV) | 1.707006723 |
| Kurtosis | 16.00543616 |
| Mean | 0.4331531532 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.331772945 |
| Sum | 7212 |
| Variance | 0.5467054754 |
| Monotonicity | Not monotonic |
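The descriptive statistics above can be recomputed from the value counts reported for qtd_filhos. The following is a minimal sketch in plain Python; it assumes the variance is the sample variance (divisor n - 1), which is an inference from the reported figures rather than something the report states.

```python
from math import sqrt

# Value counts for qtd_filhos, copied from the report's frequency table.
counts = {0: 11486, 1: 3393, 2: 1552, 3: 189, 4: 24, 5: 2, 7: 2, 14: 2}

n = sum(counts.values())                          # number of observations
total = sum(v * c for v, c in counts.items())     # "Sum" in the report
mean = total / n

# Assumption: sample variance with divisor (n - 1).
variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean                                   # coefficient of variation

print(f"n={n} sum={total} mean={mean:.10f}")
print(f"variance={variance:.10f} std={std:.10f} cv={cv:.9f}")
```

Under that assumption the results agree with the reported mean, variance, standard deviation and coefficient of variation to the displayed precision.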
Histogram with fixed size bins (bins=8)
| Value | Count | Frequency (%) |
| 0 | 11486 | 69.0% |
| 1 | 3393 | 20.4% |
| 2 | 1552 | 9.3% |
| 3 | 189 | 1.1% |
| 4 | 24 | 0.1% |
| 14 | 2 | < 0.1% |
| 7 | 2 | < 0.1% |
| 5 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 0 | 11486 | 69.0% |
| 1 | 3393 | 20.4% |
| 2 | 1552 | 9.3% |
| 3 | 189 | 1.1% |
| 4 | 24 | 0.1% |
| 5 | 2 | < 0.1% |
| 7 | 2 | < 0.1% |
| 14 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 14 | 2 | < 0.1% |
| 7 | 2 | < 0.1% |
| 5 | 2 | < 0.1% |
| 4 | 24 | 0.1% |
| 3 | 189 | 1.1% |
| 2 | 1552 | 9.3% |
| 1 | 3393 | 20.4% |
| 0 | 11486 | 69.0% |
tipo_renda
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.2 KiB |
| Working | |
|---|---|
| Commercial associate | |
| Pensioner | |
| State servant | |
| Student | 8 |
Length
| Max length | 20 |
|---|---|
| Median length | 7 |
| Mean length | 10.84648649 |
| Min length | 7 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Working |
|---|---|
| 2nd row | Commercial associate |
| 3rd row | Commercial associate |
| 4th row | Working |
| 5th row | Working |
Common Values
| Value | Count | Frequency (%) |
| Working | 8565 | 51.4% |
| Commercial associate | 3826 | 23.0% |
| Pensioner | 2800 | 16.8% |
| State servant | 1451 | 8.7% |
| Student | 8 | < 0.1% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| working | 8565 | |
| associate | 3826 | |
| commercial | 3826 | |
| pensioner | 2800 | 12.8% |
| servant | 1451 | 6.6% |
| state | 1451 | 6.6% |
| student | 8 | < 0.1% |
educacao
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.2 KiB |
| Secondary / secondary special | |
|---|---|
| Higher education | |
| Incomplete higher | 649 |
| Lower secondary | 188 |
| Academic degree | 17 |
Length
| Max length | 29 |
|---|---|
| Median length | 29 |
| Mean length | 24.80654655 |
| Min length | 15 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Secondary / secondary special |
|---|---|
| 2nd row | Secondary / secondary special |
| 3rd row | Secondary / secondary special |
| 4th row | Higher education |
| 5th row | Incomplete higher |
Common Values
| Value | Count | Frequency (%) |
| Secondary / secondary special | 11245 | 67.5% |
| Higher education | 4551 | 27.3% |
| Incomplete higher | 649 | 3.9% |
| Lower secondary | 188 | 1.1% |
| Academic degree | 17 | 0.1% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| secondary | 22678 | |
| special | 11245 | |
| 11245 | ||
| higher | 5200 | 9.3% |
| education | 4551 | 8.2% |
| incomplete | 649 | 1.2% |
| lower | 188 | 0.3% |
| degree | 17 | < 0.1% |
| academic | 17 | < 0.1% |
estado_civil
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.2 KiB |
| Married | |
|---|---|
| Single / not married | |
| Civil marriage | |
| Separated | 945 |
| Widow | 707 |
Length
| Max length | 20 |
|---|---|
| Median length | 7 |
| Mean length | 9.156876877 |
| Min length | 5 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Married |
|---|---|
| 2nd row | Single / not married |
| 3rd row | Single / not married |
| 4th row | Married |
| 5th row | Married |
Common Values
| Value | Count | Frequency (%) |
| Married | 11680 | 70.2% |
| Single / not married | 2035 | 12.2% |
| Civil marriage | 1283 | 7.7% |
| Separated | 945 | 5.7% |
| Widow | 707 | 4.2% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| married | 13715 | |
| not | 2035 | 8.5% |
| 2035 | 8.5% | |
| single | 2035 | 8.5% |
| marriage | 1283 | 5.3% |
| civil | 1283 | 5.3% |
| separated | 945 | 3.9% |
| widow | 707 | 2.9% |
tipo_residencia
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.2 KiB |
| House / apartment | |
|---|---|
| With parents | 738 |
| Municipal apartment | 520 |
| Rented apartment | 227 |
| Office apartment | 120 |
Length
| Max length | 19 |
|---|---|
| Median length | 17 |
| Mean length | 16.81147147 |
| Min length | 12 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | House / apartment |
|---|---|
| 2nd row | House / apartment |
| 3rd row | House / apartment |
| 4th row | House / apartment |
| 5th row | House / apartment |
Common Values
| Value | Count | Frequency (%) |
| House / apartment | 14974 | 89.9% |
| With parents | 738 | 4.4% |
| Municipal apartment | 520 | 3.1% |
| Rented apartment | 227 | 1.4% |
| Office apartment | 120 | 0.7% |
| Co-op apartment | 71 | 0.4% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| apartment | 15912 | |
| 14974 | ||
| house | 14974 | |
| parents | 738 | 1.5% |
| with | 738 | 1.5% |
| municipal | 520 | 1.1% |
| rented | 227 | 0.5% |
| office | 120 | 0.2% |
| co-op | 71 | 0.1% |
idade
Real number (ℝ≥0)
| Distinct | 5298 |
|---|---|
| Distinct (%) | 31.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 44.31951277 |
| Minimum | 22.03013699 |
|---|---|
| Maximum | 68.90958904 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.2 KiB |
Quantile statistics
| Minimum | 22.03013699 |
|---|---|
| 5-th percentile | 27.69315068 |
| Q1 | 34.8739726 |
| median | 43.49315068 |
| Q3 | 53.4109589 |
| 95-th percentile | 63.14780822 |
| Maximum | 68.90958904 |
| Range | 46.87945205 |
| Interquartile range (IQR) | 18.5369863 |
Descriptive statistics
| Standard deviation | 11.2288368 |
|---|---|
| Coefficient of variation (CV) | 0.2533610163 |
| Kurtosis | -1.032189251 |
| Mean | 44.31951277 |
| Median Absolute Deviation (MAD) | 9.250684932 |
| Skewness | 0.1792052356 |
| Sum | 737919.8877 |
| Variance | 126.0867759 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 34.72876712 | 22 | 0.1% |
| 40.18356164 | 22 | 0.1% |
| 41.47945205 | 21 | 0.1% |
| 45.93972603 | 20 | 0.1% |
| 37.04109589 | 20 | 0.1% |
| 57.87945205 | 19 | 0.1% |
| 42.94520548 | 18 | 0.1% |
| 34.2 | 18 | 0.1% |
| 27.88219178 | 17 | 0.1% |
| 43.39452055 | 17 | 0.1% |
| Other values (5288) | 16456 | 98.8% |
| Value | Count | Frequency (%) |
| 22.03013699 | 1 | < 0.1% |
| 22.07123288 | 1 | < 0.1% |
| 22.22191781 | 1 | < 0.1% |
| 22.41643836 | 1 | < 0.1% |
| 22.56986301 | 2 | |
| 22.66849315 | 1 | < 0.1% |
| 22.86849315 | 3 | |
| 22.88493151 | 1 | < 0.1% |
| 23.00273973 | 3 | |
| 23.12876712 | 2 |
| Value | Count | Frequency (%) |
| 68.90958904 | 2 | < 0.1% |
| 68.76438356 | 1 | < 0.1% |
| 68.52054795 | 1 | < 0.1% |
| 68.4109589 | 1 | < 0.1% |
| 68.34520548 | 2 | < 0.1% |
| 68.30684932 | 1 | < 0.1% |
| 68.25753425 | 2 | < 0.1% |
| 68.0630137 | 2 | < 0.1% |
| 68.03013699 | 1 | < 0.1% |
| 68.00273973 | 6 |
tempo_emprego
Real number (ℝ)
| Distinct | 3005 |
|---|---|
| Distinct (%) | 18.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -161.4164463 |
| Minimum | -1000.665753 |
|---|---|
| Maximum | 42.90684932 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 2793 |
| Negative (%) | 16.8% |
| Memory size | 130.2 KiB |
Quantile statistics
| Minimum | -1000.665753 |
|---|---|
| 5-th percentile | -1000.665753 |
| Q1 | 1.183561644 |
| median | 4.691780822 |
| Q3 | 9.088356164 |
| 95-th percentile | 20.19452055 |
| Maximum | 42.90684932 |
| Range | 1043.572603 |
| Interquartile range (IQR) | 7.904794521 |
Descriptive statistics
| Standard deviation | 376.8439122 |
|---|---|
| Coefficient of variation (CV) | -2.334606671 |
| Kurtosis | 1.16175938 |
| Mean | -161.4164463 |
| Median Absolute Deviation (MAD) | 3.831506849 |
| Skewness | -1.777557843 |
| Sum | -2687583.83 |
| Variance | 142011.3342 |
| Monotonicity | Not monotonic |
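The statistics above are dominated by the single value -1000.665753, which accounts for every negative entry (2793 rows, 16.8%) and matches -365243/365 to the displayed precision. The sketch below backs out the mean of the remaining rows using only figures from this report. Treating that value as a placeholder for missing employment time is an assumption; the report itself does not say so.

```python
# Figures copied from the tempo_emprego section of the report.
n_total = 16650
reported_sum = -2687583.83
sentinel = -1000.665753      # assumed placeholder value (hypothetical reading)
n_sentinel = 2793            # rows holding the placeholder

# Mean as reported, including placeholder rows.
mean_all = reported_sum / n_total

# Remove the placeholder rows from the sum and the count.
valid_n = n_total - n_sentinel
valid_sum = reported_sum - n_sentinel * sentinel
valid_mean = valid_sum / valid_n

print(f"mean incl. placeholder: {mean_all:.4f}")
print(f"rows without placeholder: {valid_n}")
print(f"mean excl. placeholder: {valid_mean:.4f}")
```

If the assumption holds, the mean of about -161 reflects the placeholder rather than typical employment time, so the value would normally be replaced with a missing-value marker before modelling.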
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| -1000.665753 | 2793 | 16.8% |
| 4.216438356 | 41 | 0.2% |
| 4.797260274 | 29 | 0.2% |
| 6.934246575 | 28 | 0.2% |
| 5.216438356 | 28 | 0.2% |
| 0.5479452055 | 28 | 0.2% |
| 0.295890411 | 26 | 0.2% |
| 1.260273973 | 26 | 0.2% |
| 3.934246575 | 26 | 0.2% |
| 3.602739726 | 25 | 0.2% |
| Other values (2995) | 13600 | 81.7% |
| Value | Count | Frequency (%) |
| -1000.665753 | 2793 | |
| 0.1178082192 | 1 | < 0.1% |
| 0.1780821918 | 1 | < 0.1% |
| 0.1917808219 | 1 | < 0.1% |
| 0.2 | 9 | 0.1% |
| 0.2164383562 | 1 | < 0.1% |
| 0.2410958904 | 1 | < 0.1% |
| 0.2438356164 | 4 | < 0.1% |
| 0.2493150685 | 3 | < 0.1% |
| 0.2520547945 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 42.90684932 | 2 | < 0.1% |
| 41.2 | 12 | |
| 40.78630137 | 3 | < 0.1% |
| 40.57534247 | 6 | |
| 40.47945205 | 1 | < 0.1% |
| 39.82465753 | 4 | < 0.1% |
| 39.65205479 | 4 | < 0.1% |
| 39.48767123 | 3 | < 0.1% |
| 39.28219178 | 1 | < 0.1% |
| 38.70410959 | 1 | < 0.1% |
possui_celular
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.2 KiB |
| 1 |
|---|
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 16650 | 100.0% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 1 | 16650 |
possui_fone_comercial
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.2 KiB |
| 0 | |
|---|---|
| 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 12900 | 77.5% |
| 1 | 3750 | 22.5% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 0 | 12900 | |
| 1 | 3750 | 22.5% |
possui_fone
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.2 KiB |
| 0 | |
|---|---|
| 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 11727 | 70.4% |
| 1 | 4923 | 29.6% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 0 | 11727 | |
| 1 | 4923 |
possui_email
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 130.2 KiB |
| 0 | |
|---|---|
| 1 | 1480 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 0 |
|---|---|
| Distinct characters | 0 |
| Distinct categories | 0 ? |
| Distinct scripts | 0 ? |
| Distinct blocks | 0 ? |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 15170 | 91.1% |
| 1 | 1480 | 8.9% |
Length
Histogram of lengths of the category
Pie chart
| Value | Count | Frequency (%) |
| 0 | 15170 | |
| 1 | 1480 | 8.9% |
qt_pessoas_residencia
Real number (ℝ≥0)
HIGH CORRELATION
| Distinct | 9 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.211891892 |
| Minimum | 1 |
|---|---|
| Maximum | 15 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 130.2 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 2 |
| Q3 | 3 |
| 95-th percentile | 4 |
| Maximum | 15 |
| Range | 14 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 0.9037546698 |
|---|---|
| Coefficient of variation (CV) | 0.4085889881 |
| Kurtosis | 5.750133465 |
| Mean | 2.211891892 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 1.191339607 |
| Sum | 36828 |
| Variance | 0.8167725032 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=9)
| Value | Count | Frequency (%) |
| 2 | 9042 | 54.3% |
| 1 | 3022 | 18.2% |
| 3 | 2887 | 17.3% |
| 4 | 1489 | 8.9% |
| 5 | 180 | 1.1% |
| 6 | 25 | 0.2% |
| 15 | 2 | < 0.1% |
| 9 | 2 | < 0.1% |
| 7 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1 | 3022 | 18.2% |
| 2 | 9042 | 54.3% |
| 3 | 2887 | 17.3% |
| 4 | 1489 | 8.9% |
| 5 | 180 | 1.1% |
| 6 | 25 | 0.2% |
| 7 | 1 | < 0.1% |
| 9 | 2 | < 0.1% |
| 15 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 15 | 2 | < 0.1% |
| 9 | 2 | < 0.1% |
| 7 | 1 | < 0.1% |
| 6 | 25 | 0.2% |
| 5 | 180 | 1.1% |
| 4 | 1489 | 8.9% |
| 3 | 2887 | 17.3% |
| 2 | 9042 | 54.3% |
| 1 | 3022 | 18.2% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
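The definition above can be sketched in plain Python as Pearson's r applied to ranks. The helper names and the toy data are illustrative assumptions, not taken from the profiled dataset or from the report's own implementation.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))

# Toy data: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]
rho = spearman(x, y)
r = pearson(x, y)
print(rho, r)
```

On this toy input the rank correlation is perfect while Pearson's r is lower, which illustrates the point about nonlinear monotonic relationships.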
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
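As a minimal sketch of that formula, the snippet below computes r in plain Python and checks the invariance under changes of location and (positive) scale. The function name and the toy data are assumptions for illustration only.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/(n - 1) factors of the covariance and the variances cancel.
    return cov / sqrt(ss_x * ss_y)

# Toy data, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r_original = pearson_r(x, y)
# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# An exact decreasing line gives total negative linear correlation.
r_line = pearson_r(x, [-3 * a + 7 for a in x])
print(r_original, r_rescaled, r_line)
```

The first two results coincide up to floating-point error, and the steepness of the line in the last case has no effect on the magnitude of r.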
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
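The pair-counting description above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition in plain Python; statistical libraries often default to a tie-corrected variant (tau-b), so values may differ when the data contain ties. The function name and data are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1      # both variables move in the same direction
        elif sign < 0:
            discordant += 1      # the variables move in opposite directions
        # sign == 0 means a tie in x or y; the pair counts toward neither.
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs

# Toy data: one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.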
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
First rows
| | sexo | posse_de_veiculo | posse_de_imovel | qtd_filhos | tipo_renda | educacao | estado_civil | tipo_residencia | idade | tempo_emprego | possui_celular | possui_fone_comercial | possui_fone | possui_email | qt_pessoas_residencia | mau |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | M | Y | Y | 0 | Working | Secondary / secondary special | Married | House / apartment | 58.832877 | 3.106849 | 1 | 0 | 0 | 0 | 2.0 | False |
| 1 | F | N | Y | 0 | Commercial associate | Secondary / secondary special | Single / not married | House / apartment | 52.356164 | 8.358904 | 1 | 0 | 1 | 1 | 1.0 | False |
| 2 | F | N | Y | 0 | Commercial associate | Secondary / secondary special | Single / not married | House / apartment | 52.356164 | 8.358904 | 1 | 0 | 1 | 1 | 1.0 | False |
| 3 | M | Y | Y | 0 | Working | Higher education | Married | House / apartment | 46.224658 | 2.106849 | 1 | 1 | 1 | 1 | 2.0 | False |
| 4 | F | Y | N | 0 | Working | Incomplete higher | Married | House / apartment | 29.230137 | 3.021918 | 1 | 0 | 0 | 0 | 2.0 | False |
| 5 | F | Y | N | 0 | Working | Incomplete higher | Married | House / apartment | 29.230137 | 3.021918 | 1 | 0 | 0 | 0 | 2.0 | False |
| 6 | F | N | Y | 0 | Working | Secondary / secondary special | Married | House / apartment | 27.482192 | 4.024658 | 1 | 0 | 1 | 0 | 2.0 | False |
| 7 | F | N | Y | 0 | Working | Secondary / secondary special | Married | House / apartment | 27.482192 | 4.024658 | 1 | 0 | 1 | 0 | 2.0 | False |
| 8 | F | N | Y | 1 | Working | Secondary / secondary special | Single / not married | House / apartment | 30.049315 | 4.438356 | 1 | 0 | 0 | 0 | 2.0 | False |
| 9 | F | N | Y | 1 | Working | Secondary / secondary special | Single / not married | House / apartment | 30.049315 | 4.438356 | 1 | 0 | 0 | 0 | 2.0 | False |
Last rows
| | sexo | posse_de_veiculo | posse_de_imovel | qtd_filhos | tipo_renda | educacao | estado_civil | tipo_residencia | idade | tempo_emprego | possui_celular | possui_fone_comercial | possui_fone | possui_email | qt_pessoas_residencia | mau |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 16640 | M | N | N | 1 | Working | Secondary / secondary special | Married | Municipal apartment | 35.468493 | 6.624658 | 1 | 0 | 0 | 1 | 3.0 | True |
| 16641 | M | Y | N | 0 | Working | Incomplete higher | Married | With parents | 24.997260 | 2.630137 | 1 | 1 | 0 | 0 | 2.0 | True |
| 16642 | M | Y | N | 0 | Working | Incomplete higher | Married | With parents | 24.997260 | 2.630137 | 1 | 1 | 0 | 0 | 2.0 | True |
| 16643 | F | N | Y | 0 | Pensioner | Secondary / secondary special | Married | House / apartment | 60.304110 | -1000.665753 | 1 | 0 | 0 | 0 | 2.0 | True |
| 16644 | F | N | Y | 1 | Working | Secondary / secondary special | Single / not married | House / apartment | 34.857534 | 3.101370 | 1 | 1 | 1 | 0 | 1.0 | True |
| 16645 | F | N | Y | 0 | Working | Secondary / secondary special | Civil marriage | House / apartment | 54.109589 | 9.884932 | 1 | 0 | 0 | 0 | 2.0 | True |
| 16646 | F | N | Y | 0 | Commercial associate | Secondary / secondary special | Married | House / apartment | 43.389041 | 7.380822 | 1 | 1 | 1 | 0 | 2.0 | True |
| 16647 | M | Y | Y | 0 | Working | Secondary / secondary special | Married | House / apartment | 30.005479 | 9.800000 | 1 | 1 | 0 | 0 | 2.0 | True |
| 16648 | M | Y | Y | 0 | Working | Secondary / secondary special | Married | House / apartment | 30.005479 | 9.800000 | 1 | 1 | 0 | 0 | 2.0 | True |
| 16649 | F | N | Y | 0 | Pensioner | Higher education | Married | House / apartment | 33.936986 | 3.630137 | 1 | 0 | 1 | 1 | 2.0 | True |
Most frequently occurring
| | sexo | posse_de_veiculo | posse_de_imovel | qtd_filhos | tipo_renda | educacao | estado_civil | tipo_residencia | idade | tempo_emprego | possui_celular | possui_fone_comercial | possui_fone | possui_email | qt_pessoas_residencia | mau | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1304 | F | N | Y | 0 | Working | Secondary / secondary special | Married | House / apartment | 37.041096 | 15.035616 | 1 | 0 | 0 | 0 | 2.0 | False | 20 |
| 2652 | M | N | N | 2 | Working | Higher education | Civil marriage | House / apartment | 45.939726 | 8.460274 | 1 | 1 | 0 | 0 | 4.0 | False | 20 |
| 1048 | F | N | Y | 0 | Pensioner | Secondary / secondary special | Widow | House / apartment | 57.879452 | -1000.665753 | 1 | 0 | 0 | 0 | 1.0 | False | 19 |
| 2298 | F | Y | Y | 0 | Working | Secondary / secondary special | Married | House / apartment | 46.561644 | 19.901370 | 1 | 0 | 0 | 0 | 2.0 | False | 16 |
| 2691 | M | N | Y | 0 | Commercial associate | Secondary / secondary special | Married | House / apartment | 34.200000 | 11.065753 | 1 | 0 | 0 | 0 | 2.0 | False | 16 |
| 3699 | M | Y | Y | 2 | Working | Higher education | Married | House / apartment | 36.093151 | 6.786301 | 1 | 0 | 0 | 0 | 4.0 | False | 16 |
| 687 | F | N | Y | 0 | Commercial associate | Secondary / secondary special | Married | House / apartment | 53.375342 | 7.884932 | 1 | 0 | 0 | 0 | 2.0 | False | 14 |
| 731 | F | N | Y | 0 | Commercial associate | Secondary / secondary special | Single / not married | Rented apartment | 42.517808 | 8.860274 | 1 | 0 | 0 | 0 | 1.0 | False | 14 |
| 1974 | F | Y | N | 0 | Working | Secondary / secondary special | Married | House / apartment | 53.263014 | 11.134247 | 1 | 0 | 0 | 0 | 2.0 | False | 14 |
| 972 | F | N | Y | 0 | Pensioner | Secondary / secondary special | Married | House / apartment | 65.123288 | -1000.665753 | 1 | 0 | 0 | 0 | 2.0 | False | 13 |