reader

Reader functions for various file formats

read_dynamx(filepath_or_buffer, time_conversion=('min', 's'))

Reads DynamX .csv files and returns the resulting peptide table as a narwhals DataFrame.

Parameters:

Name Type Description Default
filepath_or_buffer Path | str | IO | bytes

File path of the .csv file, or an io.StringIO object.

required
time_conversion Optional[tuple[Literal['h', 'min', 's'], Literal['h', 'min', 's']]]

How to convert the time unit of the 'exposure' field. Format is ('<from>', '<to>'). Unit options are 'h', 'min' or 's'. Pass None to skip the conversion.

('min', 's')

Returns:

Type Description
DataFrame

Peptide table as a narwhals DataFrame.
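The time_conversion argument can be read as a (from, to) pair that is turned into a single multiplication factor. The following is a minimal standalone sketch of that arithmetic, using the same seconds-per-unit lookup table as the source shown on this page; the names TIME_LUT and time_factor are hypothetical helpers introduced only for illustration and are not part of the hdxms_datasets API.

```python
# Seconds per unit, mirroring the lookup table used inside read_dynamx.
TIME_LUT = {"h": 3600, "min": 60, "s": 1}


def time_factor(from_unit: str, to_unit: str) -> float:
    """Return the factor that converts a value in from_unit to to_unit."""
    return TIME_LUT[from_unit] / TIME_LUT[to_unit]


# Default behaviour: exposure values in minutes become seconds.
exposures_min = [0.5, 1.0, 10.0]
exposures_s = [t * time_factor("min", "s") for t in exposures_min]
print(exposures_s)  # [30.0, 60.0, 600.0]
```

With the default ('min', 's') every exposure value is multiplied by 60; a pair such as ('s', 'min') would divide by 60 instead.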

Source code in hdxms_datasets/reader.py
def read_dynamx(
    filepath_or_buffer: Path | str | IO | bytes,
    time_conversion: Optional[tuple[Literal["h", "min", "s"], Literal["h", "min", "s"]]] = (
        "min",
        "s",
    ),
) -> nw.DataFrame:
    """
    Reads DynamX .csv files and returns the resulting peptide table as a narwhals DataFrame.

    Args:
        filepath_or_buffer: File path of the .csv file or :class:`~io.StringIO` object.
        time_conversion: How to convert the time unit of the field 'exposure'. Format is ('<from>', '<to>').
            Unit options are 'h', 'min' or 's'. Pass None to skip the conversion.

    Returns:
        Peptide table as a narwhals DataFrame.
    """

    df = read_csv(filepath_or_buffer)
    df = df.rename({col: col.replace(" ", "_").lower() for col in df.columns})

    # Insert a 'stop' column (end + 1) directly after 'end'.
    # Copy the column names so the list can be edited safely.
    columns = list(df.columns)
    columns.insert(columns.index("end") + 1, "stop")
    df = df.with_columns((nw.col("end") + 1).alias("stop")).select(columns)

    if time_conversion is not None:
        # Seconds per unit; the ratio converts 'exposure' from the first unit to the second.
        time_lut = {"h": 3600, "min": 60, "s": 1}
        time_factor = time_lut[time_conversion[0]] / time_lut[time_conversion[1]]
        df = df.with_columns(nw.col("exposure") * time_factor)

    return df
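The column handling in the function body can be followed without a dataframe library. The sketch below applies the same renaming rule (spaces to underscores, lowercase) and the same 'stop' insertion to a plain list of names and a single row. The header and row values are illustrative assumptions, not taken from a real DynamX export, and normalize_column is a hypothetical helper.

```python
def normalize_column(name: str) -> str:
    """Replace spaces with underscores and lowercase, as read_dynamx does."""
    return name.replace(" ", "_").lower()


# Illustrative header; a real DynamX file may contain other columns.
header = ["Protein", "Start", "End", "Sequence", "Exposure", "Uptake SD"]
columns = [normalize_column(name) for name in header]

# Place 'stop' immediately after 'end' in the column order.
columns.insert(columns.index("end") + 1, "stop")

# For each row, 'stop' is simply 'end' + 1.
row = {"protein": "demo", "start": 1, "end": 10, "sequence": "MKTAYIAKQR",
       "exposure": 0.5, "uptake_sd": 0.1}
row["stop"] = row["end"] + 1
ordered_row = {name: row[name] for name in columns}

print(columns)
print(ordered_row["stop"])
```

Because 'stop' is one past 'end', the pair (start, stop) can be used as a half-open interval when 'end' is an inclusive residue number.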