ops

Miscellaneous operations.

General usage

confirmed([prompt, confirmation_required, resp])

Prompt user for confirmation to proceed.

get_obj_attr(obj[, col_names, as_dataframe])

Retrieve main attributes of an object.

eval_dtype(str_val)

Convert a string representation to its intrinsic data type.
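
One common way to do this kind of conversion is `ast.literal_eval` from the standard library. The sketch below assumes that approach; `sketch_eval_dtype` is a hypothetical name and may not match how `eval_dtype` is actually implemented.

```python
import ast


def sketch_eval_dtype(str_val):
    """Hypothetical helper: parse a string as a Python literal if possible."""
    try:
        # literal_eval only accepts literals (numbers, strings, lists, dicts, ...)
        return ast.literal_eval(str_val)
    except (ValueError, SyntaxError):
        # Not a valid literal: return the input unchanged
        return str_val
```

Because `ast.literal_eval` never executes arbitrary code, this is safer than `eval` for untrusted input.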

hash_password(password[, salt, salt_size, ...])

Hash a password using hashlib.pbkdf2_hmac (PBKDF2 algorithm with HMAC-SHA256).

verify_password(password, salt, key[, ...])

Verify if a password matches the provided salt and key.
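
The hash-then-verify round trip can be illustrated directly with `hashlib.pbkdf2_hmac`. This is a minimal sketch, not the library's code: the function names, the iteration count and the default salt size are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os


def sketch_hash_password(password, salt=None, salt_size=16, iterations=100_000):
    """Hypothetical helper: derive a key with PBKDF2-HMAC-SHA256."""
    if salt is None:
        # Generate a fresh random salt when none is supplied
        salt = os.urandom(salt_size)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, key


def sketch_verify_password(password, salt, key, iterations=100_000):
    """Hypothetical helper: re-derive the key and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, key)
```

Verification must use the same salt and iteration count that were used for hashing, which is why both are stored alongside the key.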

func_running_time(func)

Decorator to measure the execution time of a function or class method.
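
A timing decorator of this kind is usually a thin wrapper around `time.perf_counter`. The sketch below shows the general pattern only; `sketch_running_time` and its output format are assumptions.

```python
import functools
import time


def sketch_running_time(func):
    """Hypothetical decorator: print how long each call to func takes."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are timed too
            elapsed = time.perf_counter() - start
            print(f"{func.__name__} took {elapsed:.6f} s")

    return wrapper
```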

Basic computation / conversion

gps_time_to_utc(gps_time)

Convert GPS time to UTC time.
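
GPS time counts seconds from 1980-01-06 00:00:00 UTC and does not include leap seconds, so a conversion has to subtract the accumulated offset (18 seconds since 1 January 2017). The sketch below assumes the input is a number of seconds since the GPS epoch and takes the offset as a parameter; the function name and both assumptions are illustrative and may differ from `gps_time_to_utc`.

```python
from datetime import datetime, timedelta

# Start of GPS time, expressed as a naive UTC datetime
GPS_EPOCH = datetime(1980, 1, 6)


def sketch_gps_to_utc(gps_seconds, leap_seconds=18):
    """Hypothetical helper: convert seconds since the GPS epoch to UTC.

    leap_seconds is the GPS-UTC offset valid at the time of interest
    (18 s from 2017 onwards); it is an input here, not looked up.
    """
    return GPS_EPOCH + timedelta(seconds=gps_seconds - leap_seconds)
```

A fixed offset is only correct for timestamps after the most recent leap second; earlier timestamps need the offset that applied on that date.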

parse_size(size[, binary, precision])

Convert a size in bytes into a human-readable string, or a human-readable size back into bytes.
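
The bytes-to-text direction can be sketched as repeated division by the unit base. The helper below is hypothetical: it reuses the parameter names `binary` and `precision` from the signature above, but the unit labels and output format are assumptions.

```python
def sketch_format_size(num_bytes, binary=True, precision=1):
    """Hypothetical helper: format a byte count using binary or decimal units."""
    base = 1024 if binary else 1000
    units = (
        ["B", "KiB", "MiB", "GiB", "TiB"] if binary else ["B", "kB", "MB", "GB", "TB"]
    )
    size = float(num_bytes)
    for unit in units[:-1]:
        if abs(size) < base:
            return f"{size:.{precision}f} {unit}"
        size /= base  # move up to the next larger unit
    # Anything this large is reported in the largest unit available
    return f"{size:.{precision}f} {units[-1]}"
```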

get_number_of_chunks(file_or_obj[, ...])

Get the total number of chunks of a data file, given a minimum chunk size limit.

get_extreme_outlier_bounds(num_dat[, k])

Get the upper and lower bounds for extreme outliers using the interquartile range method.

interquartile_range(num_dat)

Calculate the interquartile range (IQR) of numerical data.
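
The interquartile range method places the bounds at Q1 - k * IQR and Q3 + k * IQR, where IQR = Q3 - Q1. The sketch below uses `statistics.quantiles` from the standard library; the function names, the quantile method and the default `k=1.5` are assumptions rather than the documented behaviour of the two functions above.

```python
import statistics


def sketch_iqr(num_dat):
    """Hypothetical helper: interquartile range Q3 - Q1."""
    q1, _, q3 = statistics.quantiles(num_dat, n=4, method="inclusive")
    return q3 - q1


def sketch_outlier_bounds(num_dat, k=1.5):
    """Hypothetical helper: (lower, upper) bounds at k IQRs beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(num_dat, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr
```

A larger `k` (3 is a common choice) widens the bounds so that only more extreme values are flagged.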

find_closest_date(date, lookup_dates[, ...])

Find the closest date to a given date from a list of dates.
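
Finding the closest date reduces to minimising the absolute time difference. This minimal sketch assumes all inputs are `datetime.date` (or `datetime.datetime`) objects of the same kind; the function name is hypothetical and the optional arguments of `find_closest_date` are not modelled.

```python
from datetime import date


def sketch_find_closest_date(target, lookup_dates):
    """Hypothetical helper: return the lookup date nearest to target."""
    # Subtracting two dates gives a timedelta, which supports abs()
    return min(lookup_dates, key=lambda d: abs(d - target))


candidates = [date(2024, 1, 1), date(2024, 3, 1), date(2024, 6, 1)]
closest = sketch_find_closest_date(date(2024, 2, 20), candidates)
```

When two dates are equally close, `min` returns whichever comes first in `lookup_dates`.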

Basic data manipulation

Iterable

loop_in_pairs(iterable)

Generate pairs of consecutive elements from the given iterable.
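
Consecutive pairs are commonly produced by zipping an iterable with a copy of itself advanced by one element. The sketch below shows that pattern with `itertools.tee`; the name `sketch_loop_in_pairs` is hypothetical.

```python
import itertools


def sketch_loop_in_pairs(iterable):
    """Hypothetical helper: yield (x0, x1), (x1, x2), ... from an iterable."""
    a, b = itertools.tee(iterable)
    next(b, None)  # advance the second iterator by one element
    return zip(a, b)
```

On Python 3.10 and later, `itertools.pairwise` provides the same behaviour directly.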

split_list_by_size(lst, sub_len)

Split a list into evenly sized sub-lists.

split_list(lst, num_of_sub)

Split a list into a specified number of equally sized sub-lists.

split_iterable(iterable, chunk_size)

Split an iterable into evenly sized chunks.
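
The splitting functions above follow two strategies: fixing the chunk size, or fixing the number of chunks. Both are sketched below with hypothetical helper names. How leftover elements are distributed is an assumption here and may differ from the actual functions.

```python
import itertools


def sketch_split_by_size(iterable, chunk_size):
    """Hypothetical helper: yield lists of at most chunk_size items."""
    it = iter(iterable)
    while True:
        chunk = list(itertools.islice(it, chunk_size))
        if not chunk:
            return
        yield chunk  # the final chunk may be shorter than chunk_size


def sketch_split_into(lst, num_of_sub):
    """Hypothetical helper: yield num_of_sub slices whose lengths differ by at most 1."""
    base, extra = divmod(len(lst), num_of_sub)
    start = 0
    for i in range(num_of_sub):
        # The first `extra` slices each take one additional element
        end = start + base + (1 if i < extra else 0)
        yield lst[start:end]
        start = end
```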

update_dict(dictionary, updates[, inplace])

Update a (nested) dictionary with another dictionary.
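
Unlike the built-in `dict.update`, a nested update merges sub-dictionaries instead of replacing them. The following sketch illustrates that idea recursively; the function name and the choice to return the updated dictionary in both modes are assumptions.

```python
import copy


def sketch_update_dict(dictionary, updates, inplace=False):
    """Hypothetical helper: recursively merge updates into dictionary."""
    target = dictionary if inplace else copy.deepcopy(dictionary)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            # Both sides are dicts: merge them instead of overwriting
            sketch_update_dict(target[key], value, inplace=True)
        else:
            target[key] = value
    return target
```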

update_dict_keys(dictionary[, replacements])

Update keys in a (nested) dictionary based on a given replacements dictionary.

get_dict_values(key, dictionary)

Retrieve all values in a (nested) dictionary for a given key.
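
Collecting every value stored under a key in a nested dictionary is a short recursive traversal. The generator below is a hypothetical sketch that descends into nested dictionaries only; whether `get_dict_values` also searches lists is not stated here and is not assumed.

```python
def sketch_get_dict_values(key, dictionary):
    """Hypothetical helper: yield every value stored under key at any depth."""
    for k, v in dictionary.items():
        if k == key:
            yield v
        if isinstance(v, dict):
            # Recurse into nested dictionaries
            yield from sketch_get_dict_values(key, v)
```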

remove_dict_keys(dictionary, *keys)

Remove multiple keys from a dictionary.

compare_dicts(dict1, dict2)

Compare the differences between two dictionaries.

merge_dicts(*dicts)

Merge multiple dictionaries into a single dictionary.

Tabular data

detect_nan_for_str_column(data_frame[, ...])

Detect whether string-type columns of a given dataframe contain NaN values.

create_rotation_matrix(theta)

Create a 2D rotation matrix for counterclockwise rotation.
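
For an angle theta in radians, the counterclockwise rotation matrix is [[cos(theta), -sin(theta)], [sin(theta), cos(theta)]]. The sketch below builds it with plain lists for illustration; the helper names are hypothetical, and the return type of `create_rotation_matrix` may differ.

```python
import math


def sketch_rotation_matrix(theta):
    """Hypothetical helper: 2D counterclockwise rotation matrix (theta in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def sketch_rotate_point(point, theta):
    """Apply the rotation matrix to an (x, y) point."""
    (a, b), (c, d) = sketch_rotation_matrix(theta)
    x, y = point
    return (a * x + b * y, c * x + d * y)
```

Rotating the point (1, 0) by a quarter turn (pi / 2) gives (0, 1) up to floating-point error.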

dict_to_dataframe(input_dict[, k, v])

Convert a dictionary to a dataframe.

parse_csr_matrix(path_to_csr[, verbose])

Load a compressed sparse row (CSR), also known as compressed row storage (CRS), matrix from a file.

swap_cols(array, c1, c2[, as_list])

Swap positions of two columns in an array.

swap_rows(array, r1, r2[, as_list])

Swap positions of two rows in an array.

np_shift(array, step[, fill_value])

Shift an array by a desired number of rows.

Graph plotting

cmap_discretisation(cmap, n_colours)

Create a discrete colour ramp.

colour_bar_index(cmap, n_colours[, labels])

Create a colour bar with correctly aligned labels.

Web data manipulation

is_network_connected()

Check whether the current machine is connected to the Internet.

is_url(url[, partially])

Check whether a given string is a valid URL.
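
A basic syntactic check can be built on `urllib.parse.urlparse`. The sketch below treats a string as a URL only when it has both a scheme and a network location; the function name and this rule are assumptions, and the `partially` option of `is_url` is not modelled.

```python
from urllib.parse import urlparse


def sketch_is_url(url):
    """Hypothetical helper: True if url has both a scheme and a network location."""
    try:
        parts = urlparse(url)
    except ValueError:
        # urlparse can raise on malformed input, e.g. invalid IPv6 brackets
        return False
    return bool(parts.scheme and parts.netloc)
```

This checks syntax only and does not confirm that the address can be reached.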

is_url_connectable(url)

Check if the current machine can connect to the given URL.

is_downloadable(url[, request_field])

Check if a URL leads to a webpage where downloadable content is available.

init_requests_session(url[, max_retries, ...])

Instantiate a requests session with configurable retry behaviour.

load_user_agent_strings([shuffled, ...])

Load user-agent strings for popular web browsers.

get_user_agent_string([fancy])

Get a random user-agent string for a specified browser.

fake_requests_headers([randomized])

Generate fake HTTP headers for requests.get.

download_file_from_url(url, path_to_file[, ...])

Download a file from a valid URL.

GitHubFileDownloader(repo_url[, ...])

Download files from GitHub repositories.