FE581 Topics: R [Python and R for the Modern Data Scientist]

Users often forget that Jupyter is short for “Julia, Python, and R” because it’s very Python-centric.

One last note: Python users refer to themselves as Pythonistas, which is a really cool name! There’s no real equivalent in R, and they also don’t get a really cool animal, but that’s life when you’re a single-letter language. R users are typically called…wait for it…useRs! (Exclamation optional.) Indeed, the official annual conference is called useR! (exclamation obligatory), and the publisher Springer has an ongoing and very excellent series of books of the same name.

A Python:R Bilingual Dictionary


Package Management

Installing a single package

Installing specific package versions

Installing multiple packages

Loading packages

Assignment Operators

R has a frustratingly large number of assignment operators: <-, <<-, ->, ->> and =.

Types

The four most common user-defined atomic-vector types in R:

| Type | Data frame shorthand | Tibble shorthand | Description | Example |
|---|---|---|---|---|
| Logical | logi | <lgl> | Binary data | TRUE/FALSE, T/F, 1/0 |
| Integer | int | <int> | Whole numbers from -∞ to +∞ | 7, 9, 2, -4 |
| Double | num | <dbl> | Real numbers from -∞ to +∞ | 3.14, 2.78, 6.45 |
| Character | chr | <chr> | All alpha-numeric characters, including white spaces | “Apple”, “Dog” |

The four most common user-defined types in Python:

| Type | Shorthand | Description | Example |
|---|---|---|---|
| Boolean | bool | Binary data | True/False |
| Integer | int | Whole numbers from -∞ to +∞ | 7, 9, 2, -4 |
| Float | float | Real numbers from -∞ to +∞ | 3.14, 2.78, 6.45 |
| String | str | All alpha-numeric characters, including white spaces | “Apple”, “Dog” |
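As a quick check of the Python table, the sketch below asks the interpreter for the type name of one example value per row. The variable names are only for illustration.

```python
# One example value per row of the table above.
examples = [True, 7, 3.14, "Apple"]

# type(...).__name__ gives the shorthand used in the table.
type_names = [type(value).__name__ for value in examples]
print(type_names)  # ['bool', 'int', 'float', 'str']

# bool is a subclass of int, so True and False behave like 1 and 0 in arithmetic.
print(isinstance(True, int))  # True
print(True + True)            # 2
```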

Arithmetic Operators

Common arithmetic operators:

| Description | R Operator | Python Operator |
|---|---|---|
| Addition | + | + |
| Subtraction | - | - |
| Multiplication | * | * |
| Division (float) | / | / |
| Exponentiation | ^ or ** | ** |
| Integer division (floor) | %/% | // |
| Modulus | %% | % |
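A small sketch of the Python column, with the R spelling from the table noted in comments. The values of a and b are arbitrary.

```python
a, b = 7, 2

results = {
    "addition": a + b,           # R: a + b
    "subtraction": a - b,        # R: a - b
    "multiplication": a * b,     # R: a * b
    "division": a / b,           # R: a / b
    "exponentiation": a ** b,    # R: a ^ b or a ** b
    "floor_division": a // b,    # R: a %/% b
    "modulus": a % b,            # R: a %% b
}
for name, value in results.items():
    print(f"{name}: {value}")

# Trap when switching from R: in Python, ^ is bitwise XOR, not exponentiation.
xor_result = 7 ^ 2
print(xor_result)  # 5
```

The one row that differs in spelling for every operator is the last three: exponentiation, floor division, and modulus are worth memorizing as pairs.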

Attributes

Class attributes:

Keywords

Reserved words and keywords:

Reserved words (keywords) are names the language keeps for itself, so you cannot use them as variable names.
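On the Python side, the standard library's keyword module lists the reserved words. The sketch below checks two names and shows that assigning to a keyword is rejected; the names "for", "city", and "class" are just examples.

```python
import keyword

# kwlist holds every reserved word for the running Python version.
print(len(keyword.kwlist), "keywords, starting with", keyword.kwlist[:5])

is_for_reserved = keyword.iskeyword("for")
is_city_reserved = keyword.iskeyword("city")
print(is_for_reserved, is_city_reserved)  # True False

# Using a keyword as a variable name fails at compile time with a SyntaxError.
try:
    compile("class = 5", "<example>", "exec")
except SyntaxError as err:
    rejected = True
    print("SyntaxError:", err.msg)
else:
    rejected = False
```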

Functions and Methods

40

600

300

x: 40 y: 100

 

Style and Naming Conventions

Style in R is generally more loosely defined than in Python. Nonetheless, see the Advanced R style guide by Hadley Wickham (CRC Press) or Google’s R Style Guide for suggestions.

For Python, see the PEP 8 style guide.

Analogous Data Storage Objects

Analogous Python objects for common R objects:

| R Structure | Python Analogous Structure |
|---|---|
| Vector (one-dimensional, homogeneous) | ndarray, but also scalars, homogeneous list and tuple |
| Vector, matrix, or array | NumPy n-dimensional array (ndarray) |
| Unnamed list (heterogeneous) | list |
| Named list (heterogeneous) | Dictionary (dict); insertion-ordered since Python 3.7, but not indexable by position |
| Environment (named, but unordered elements) | Dictionary (dict) |
| Variable/column in a data.frame | Pandas Series (pd.Series) |
| Two-dimensional data.frame | Pandas data frame (pd.DataFrame) |

Analogous R objects for common Python objects:

| Python Structure | R Analogous Structure |
|---|---|
| Scalar | One-element long vector |
| list (homogeneous) | Vector, but as if lacking vectorization |
| list (heterogeneous) | Unnamed list |
| tuple (immutable) | Vector, or list as separate outputs from a function |
| Dictionary (dict), key-value pairs | Named list or, better, environment |
| NumPy n-dimensional array (ndarray) | Vector, matrix, or array |
| Pandas Series | Vector, variable/column in a data.frame |
| Pandas DataFrame | Two-dimensional data.frame |
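Two rows of the table are easier to see in code: a homogeneous list "lacking vectorization", and a tuple as the separate outputs of a function. The sketch below uses only built-ins; dist_range is a hypothetical helper made up for this illustration.

```python
dist = [584, 1054, 653]  # homogeneous list of distances

# A list is not vectorized: * repeats the list instead of scaling each element.
repeated = dist * 2
# Element-wise work needs an explicit comprehension (or a NumPy array).
doubled = [d * 2 for d in dist]
print(repeated)  # [584, 1054, 653, 584, 1054, 653]
print(doubled)   # [1168, 2108, 1306]


def dist_range(values):
    """Hypothetical helper: return the smallest and largest value as a tuple."""
    return min(values), max(values)


# The returned tuple unpacks into separate names.
shortest, longest = dist_range(dist)
print(shortest, longest)  # 584 1054

# Tuples are immutable: item assignment raises TypeError.
try:
    dist_range(dist)[0] = 0
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True
print(tuple_is_immutable)  # True
```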

'Istanbul' 'Urumqi' 'Almaty' 584 1054 653

['Istanbul', 'Berlin', 'Korla']

[584, 1054, 653]

One-dimensional, heterogeneous key-value pairs (lists in R, dictionaries in Python):
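For the Python half, a minimal dictionary sketch. The keys and the "TR" value are illustrative, not taken from the original notebook.

```python
# Values of different types stored under string keys.
city = {"name": "Istanbul", "dist": 584}

# Lookup is by key, much like city$dist or city[["dist"]] on a named list in R.
print(city["dist"])  # 584

# Adding a key-value pair after creation (illustrative value).
city["country"] = "TR"
keys = list(city.keys())
print(keys)  # ['name', 'dist', 'country']

# .get() returns None instead of raising KeyError when the key is missing.
missing = city.get("pop")
print(missing)  # None
```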


A data.frame: 1 × 3

distpopcountry
584143275DE

A data.frame: 1 × 3

distpopcountry
84275KZ

13

'coefficients' 'residuals' 'effects' 'rank' 'fitted.values' 'assign' 'qr' 'df.residual' 'contrasts' 'xlevels' 'call' 'terms' 'model'


Data Frames

Data Frames in Python:


['city', 'dist', 'pop', 'area', 'country']

[['Munich', 'Paris', 'Amsterdam'], [584, 1054, 653], [1484226, 2175601, 1558755], [310.43, 105.4, 219.32], ['DE', 'FR', 'NL']]

[(['Munich', 'Paris', 'Amsterdam'], 'city'), ([584, 1054, 653], 'dist'), ([1484226, 2175601, 1558755], 'pop'), ([310.43, 105.4, 219.32], 'area'), (['DE', 'FR', 'NL'], 'country')]
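The three printed lists above look like column names, column values, and the two zipped together. The original cells are not preserved, so the following is only a plausible reconstruction using built-ins; the variable names are assumptions.

```python
names = ['city', 'dist', 'pop', 'area', 'country']
columns = [
    ['Munich', 'Paris', 'Amsterdam'],
    [584, 1054, 653],
    [1484226, 2175601, 1558755],
    [310.43, 105.4, 219.32],
    ['DE', 'FR', 'NL'],
]

# zip pairs each column with its name, giving (values, name) tuples.
pairs = list(zip(columns, names))
print(pairs[1])  # ([584, 1054, 653], 'dist')

# Swapping the order gives a name -> values mapping, one entry per column.
table = dict(zip(names, columns))
print(table['country'])  # ['DE', 'FR', 'NL']

# Transposing with zip(*columns) gives one tuple per row.
rows = list(zip(*columns))
print(rows[0])  # ('Munich', 584, 1484226, 310.43, 'DE')
```

A dictionary of equal-length lists like table is a shape that pd.DataFrame accepts directly, assuming pandas is installed.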


Two-dimensional, heterogeneous, tabular data frames in R:


Multidimensional arrays:

R:


444 666


Python:


Logical Expressions

Relational operators

| Description | R Operator | Python Operator |
|---|---|---|
| Equivalency | == | == |
| Non-equivalency | != | != |
| Greater-than (or equal to) | > (>=) | > (>=) |
| Lesser-than (or equal to) | < (<=) | < (<=) |
| Negation | !x | not x |
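A short sketch of the Python column of this table, with arbitrary values for x and y.

```python
x, y = 3, 5

equal = x == y
not_equal = x != y
print(equal, not_equal)  # False True

# Python allows chained comparisons: this reads as (x < y) and (y <= 5).
chained = x < y <= 5
print(chained)  # True

# not is a keyword operator, not a function; it binds looser than comparisons.
negated = not x > y
print(negated)  # True
```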


Python:

Logical operators

| Description | R Operator | Python Operator |
|---|---|---|
| AND | &, && | &, and |
| OR | \|, \|\| | \|, or |
| Within (membership) | y %in% x | in, not in |
| Identity | identical() | is, is not |
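The Python operators in this table behave slightly differently from one another, so here is a minimal sketch with two equal but separate lists. Note that Python's is tests whether two names point to the same object, which is stricter than comparing values.

```python
x = [1, 2, 3]
y = [1, 2, 3]  # equal contents, but a separate object

# Membership tests.
contains_two = 2 in x
lacks_five = 5 not in x
print(contains_two, lacks_five)  # True True

# == compares values; is compares object identity.
same_value = x == y
same_object = x is y
print(same_value, same_object)  # True False

# and / or work on whole expressions and short-circuit.
print(True and False, True or False)  # False True

# & and | are bitwise operators; on plain bools they give the same truth table.
print(True & False, True | False)  # False True

# or returns the first truthy operand, which is handy for defaults.
fallback = 0 or "default"
print(fallback)  # default
```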

 

R for Pythonistas