Read FCS files

In this notebook, we load an FCS file into the AnnData format, move the forward scatter (FSC) and side scatter (SSC) information to the .obs section of the AnnData object, and perform compensation on the data.

import readfcs
import pytometry as pm

We read an example dataset shipped with the readfcs package (exposed there as Oetjen18_t1). The FCS file was originally deposited on FlowRepository.

path_data = readfcs.datasets.Oetjen18_t1()
adata = pm.io.read_fcs(path_data)
adata
AnnData object with n_obs × n_vars = 241552 × 20
    var: 'n', 'channel', 'marker', '$PnR', '$PnB', '$PnE', '$PnV', '$PnG'
    uns: 'meta'

The .var section of the AnnData object contains the channel information. By default, the marker names are used as var_names; channels without a marker (the scatter channels and Time) keep their channel name. The channel names are additionally stored in the "channel" column.

adata.var
           n   channel  marker     $PnR    $PnB  $PnE  $PnV  $PnG
FSC-A      1   FSC-A               262144  32    0,0   510   1.0
FSC-H      2   FSC-H               262144  32    0,0   510   1.0
FSC-W      3   FSC-W               262144  32    0,0   510   1.0
SSC-A      4   SSC-A               262144  32    0,0   310   1.0
SSC-H      5   SSC-H               262144  32    0,0   310   1.0
SSC-W      6   SSC-W               262144  32    0,0   310   1.0
CD95       7   R660-A   CD95       262144  32    0,0   490   1.0
CD8        8   R780-A   CD8        262144  32    0,0   475   1.0
CD27       9   B515-A   CD27       262144  32    0,0   470   1.0
CXCR4      10  B710-A   CXCR4      262144  32    0,0   417   1.0
CCR7       11  V450-A   CCR7       262144  32    0,0   400   1.0
LIVE/DEAD  12  V545-A   LIVE/DEAD  262144  32    0,0   495   1.0
CD4        13  V605-A   CD4        262144  32    0,0   400   1.0
CD45RA     14  V655-A   CD45RA     262144  32    0,0   375   1.0
CD3        15  V800-A   CD3        262144  32    0,0   400   1.0
CD49B      16  G560-A   CD49B      262144  32    0,0   400   1.0
CD14/19    17  G610-A   CD14/19    262144  32    0,0   415   1.0
CD69       18  G660-A   CD69       262144  32    0,0   470   1.0
CD103      19  G780-A   CD103      262144  32    0,0   435   1.0
Time       20  Time                262144  32    0,0         0.01
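
As a rough illustration of how this table can be used, the sketch below separates the channels that carry a marker from the scatter and Time channels and builds a marker-to-channel lookup. The small DataFrame is a hand-made stand-in for a few rows of adata.var, and the code assumes that missing markers are stored as empty strings or NaN; check your own object before relying on that.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of adata.var (values copied from the table above).
var = pd.DataFrame(
    {
        "channel": ["FSC-A", "SSC-A", "R660-A", "R780-A", "Time"],
        "marker": ["", "", "CD95", "CD8", ""],
    },
    index=["FSC-A", "SSC-A", "CD95", "CD8", "Time"],
)

# Assumption: a channel without a marker has an empty string (or NaN) in "marker".
has_marker = var["marker"].fillna("").str.strip() != ""

marker_vars = var.index[has_marker].tolist()   # fluorescence channels
other_vars = var.index[~has_marker].tolist()   # scatter and Time channels

# Look up the detector channel on which each marker was measured.
marker_to_channel = var.loc[has_marker].set_index("marker")["channel"].to_dict()

print(marker_vars)
print(other_vars)
print(marker_to_channel)
```

The same boolean mask could be applied to a full adata.var to decide which variables are scatter parameters.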

The .uns['meta'] section contains the header information from the FCS file.

adata.uns["meta"]
{'__header__': {'FCS format': 'FCS3.0',
  'text start': 256,
  'text end': 6333,
  'data start': 6339,
  'data end': 19330498,
  'analysis start': 0,
  'analysis end': 0},
 '$BEGINANALYSIS': '0',
 '$ENDANALYSIS': '0',
 '$BEGINSTEXT': '0',
 '$ENDSTEXT': '0',
 '$BEGINDATA': '6339',
 '$ENDDATA': '19330498           ',
 '$FIL': '2-13-17 T cell Panel_T_E_G05_004.fcs',
 '$SYS': 'Windows 7 6.1',
 '$TOT': 241552,
 '$PAR': 20,
 '$MODE': 'L',
 '$BYTEORD': '4,3,2,1',
 '$DATATYPE': 'F',
 '$NEXTDATA': 0,
 'CREATOR': 'BD FACSDiva Software Version 8.0',
 'TUBE NAME': 'T_E',
 '$SRC': '2-13-17 T cell Panel',
 'EXPERIMENT NAME': 'T_Memory_01-24-17',
 'GUID': '641fcb4b-10df-4636-9325-31d9c563ae6b',
 '$DATE': '10-SEP-2018',
 '$BTIM': '16:02:38',
 '$ETIM': '16:02:38',
 '$CYT': 'LSRFortessa',
 'SETTINGS': 'Cytometer',
 'CYTNUM': 'H64717700086',
 'WINDOW EXTENSION': '10.00',
 'EXPORT USER NAME': 'Administrator',
 'EXPORT TIME': '10-SEP-2018-16:02:38',
 '$OP': 'Administrator',
 'FSC ASF': '0.69',
 'AUTOBS': 'TRUE',
 '$INST': ' ',
 'LASER1NAME': 'Blue',
 'LASER1DELAY': '0.00',
 'LASER1ASF': '0.78',
 'LASER2NAME': 'Green',
 'LASER2DELAY': '129.57',
 'LASER2ASF': '0.75',
 'LASER3NAME': 'Red',
 'LASER3DELAY': '97.14',
 'LASER3ASF': '0.57',
 'LASER4NAME': 'UV',
 'LASER4DELAY': '65.51',
 'LASER4ASF': '0.77',
 'LASER5NAME': 'Violet',
 'LASER5DELAY': '34.39',
 'LASER5ASF': '0.88',
 'PLATE NAME': '2-13-17 T-memory',
 'WELL ID': 'G05',
 'PLATE ID': 'dac8255f-b7a7-4020-97d4-b2f6547e9b8b',
 '$TIMESTEP': '0.01',
 'APPLY COMPENSATION': 'TRUE',
 'THRESHOLD': 'FSC,5000',
 'P1DISPLAY': 'LIN',
 'P1BS': '0',
 'P1MS': '0',
 'P2DISPLAY': 'LIN',
 'P2BS': '0',
 'P2MS': '0',
 'P3BS': '-1',
 'P3MS': '0',
 'P4DISPLAY': 'LIN',
 'P4BS': '0',
 'P4MS': '0',
 'P5DISPLAY': 'LIN',
 'P5BS': '0',
 'P5MS': '0',
 'P6BS': '-1',
 'P6MS': '0',
 'P7DISPLAY': 'LOG',
 'P7BS': '5464',
 'P7MS': '0',
 'P8DISPLAY': 'LOG',
 'P8BS': '157',
 'P8MS': '0',
 'P9DISPLAY': 'LOG',
 'P9BS': '102',
 'P9MS': '0',
 'P10DISPLAY': 'LOG',
 'P10BS': '4284',
 'P10MS': '0',
 'P11DISPLAY': 'LOG',
 'P11BS': '682',
 'P11MS': '0',
 'P12DISPLAY': 'LOG',
 'P12BS': '177',
 'P12MS': '0',
 'P13DISPLAY': 'LOG',
 'P13BS': '2348',
 'P13MS': '0',
 'P14DISPLAY': 'LOG',
 'P14BS': '2322',
 'P14MS': '0',
 'P15DISPLAY': 'LOG',
 'P15BS': '700',
 'P15MS': '0',
 'P16DISPLAY': 'LOG',
 'P16BS': '679',
 'P16MS': '0',
 'P17DISPLAY': 'LOG',
 'P17BS': '4480',
 'P17MS': '0',
 'P18DISPLAY': 'LOG',
 'P18BS': '3799',
 'P18MS': '0',
 'P19DISPLAY': 'LOG',
 'P19BS': '225',
 'P19MS': '0',
 'P20BS': '0',
 'P20MS': '0',
 'CST SETUP STATUS': 'SUCCESS',
 'CST BEADS LOT ID': '74538',
 'CYTOMETER CONFIG NAME': 'Copy of 5 Lasers UV SORP  2B 6V 2UV 3R 5Gr',
 'CYTOMETER CONFIG CREATE DATE': '2014-01-29T14:36:56-08:00',
 'CST SETUP DATE': '2016-12-21T08:52:55-08:00',
 'CST BASELINE DATE': '2016-10-28T10:11:58-07:00',
 'CST BEADS EXPIRED': 'False',
 'CST PERFORMANCE EXPIRED': '2016-12-22T08:52:55-08:00',
 'CST REGULATORY STATUS': 'RUO Performance Check',
 'channels':       $PnN       $PnS    $PnR  $PnB $PnE $PnV  $PnG
 n                                                  
 1    FSC-A             262144    32  0,0  510   1.0
 2    FSC-H             262144    32  0,0  510   1.0
 3    FSC-W             262144    32  0,0  510   1.0
 4    SSC-A             262144    32  0,0  310   1.0
 5    SSC-H             262144    32  0,0  310   1.0
 6    SSC-W             262144    32  0,0  310   1.0
 7   R660-A       CD95  262144    32  0,0  490   1.0
 8   R780-A        CD8  262144    32  0,0  475   1.0
 9   B515-A       CD27  262144    32  0,0  470   1.0
 10  B710-A      CXCR4  262144    32  0,0  417   1.0
 11  V450-A       CCR7  262144    32  0,0  400   1.0
 12  V545-A  LIVE/DEAD  262144    32  0,0  495   1.0
 13  V605-A        CD4  262144    32  0,0  400   1.0
 14  V655-A     CD45RA  262144    32  0,0  375   1.0
 15  V800-A        CD3  262144    32  0,0  400   1.0
 16  G560-A      CD49B  262144    32  0,0  400   1.0
 17  G610-A    CD14/19  262144    32  0,0  415   1.0
 18  G660-A       CD69  262144    32  0,0  470   1.0
 19  G780-A      CD103  262144    32  0,0  435   1.0
 20    Time             262144    32  0,0       0.01,
 'header': {'FCS format': 'FCS3.0',
  'text start': 256,
  'text end': 6333,
  'data start': 6339,
  'data end': 19330498,
  'analysis start': 0,
  'analysis end': 0},
 'spill':                CD95       CD8      CD27     CXCR4      CCR7  LIVE/DEAD  \
 CD95       1.000000  0.097352  0.000000  0.007011  0.003501   0.000000   
 CD8        0.067916  1.000000  0.000000  0.000000  0.023879   0.000257   
 CD27       0.007903  0.000000  1.000000  0.007492  0.010284   0.027712   
 CXCR4      0.054363  0.100434  0.000000  1.000000  0.024458   0.001439   
 CCR7       0.002288  0.000000  0.000000  0.000000  1.000000   0.034874   
 LIVE/DEAD  0.000000  0.000000  0.003884  0.000705  0.014092   1.000000   
 CD4        0.009741  0.000263  0.000000  0.028274  0.080674   0.005858   
 CD45RA     0.275534  0.028670  0.000000  0.015079  0.102571   0.005517   
 CD3        0.022068  0.073814  0.000000  0.000000  0.099510   0.006832   
 CD49B      0.001869  0.000000  0.000000  0.048687  0.002103   0.009783   
 CD14/19    0.006566  0.000262  0.000000  0.177725  0.006049   0.000889   
 CD69       0.191802  0.032179  0.000000  0.396688  0.003517   0.000172   
 CD103      0.005300  0.105676  0.000000  0.016745  0.006381   0.000771   
 
                 CD4    CD45RA       CD3     CD49B   CD14/19      CD69  \
 CD95       0.000354  0.040952  0.008773  0.000067  0.001176  0.181536   
 CD8        0.000000  0.003852  0.100139  0.000877  0.000000  0.008990   
 CD27       0.003897  0.001299  0.000216  0.012664  0.002588  0.000000   
 CXCR4      0.000000  0.056397  0.194799  0.000491  0.000000  0.400014   
 CCR7       0.003729  0.000909  0.000282  0.000107  0.000000  0.000000   
 LIVE/DEAD  0.447288  0.144758  0.025271  0.000000  0.000000  0.000000   
 CD4        1.000000  0.434510  0.085092  0.055112  0.390481  0.290524   
 CD45RA     0.180690  1.000000  0.169154  0.000643  0.014777  0.120247   
 CD3        0.000891  0.003268  1.000000  0.000507  0.000000  0.000000   
 CD49B      0.038672  0.008945  0.001060  1.000000  0.400143  0.148085   
 CD14/19    0.065991  0.022692  0.003962  0.124522  1.000000  0.493361   
 CD69       0.000497  0.026221  0.009804  0.022587  0.010298  1.000000   
 CD103      0.000631  0.000561  0.126363  0.049163  0.018982  0.008683   
 
               CD103  
 CD95       0.005969  
 CD8        0.083250  
 CD27       0.000000  
 CXCR4      0.119091  
 CCR7       0.000000  
 LIVE/DEAD  0.000000  
 CD4        0.012522  
 CD45RA     0.005286  
 CD3        0.027061  
 CD49B      0.004221  
 CD14/19    0.019125  
 CD69       0.050517  
 CD103      1.000000  }
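
The 'spill' entry is the spillover matrix that compensation relies on. The following is a minimal sketch of the underlying linear algebra only, not pytometry's implementation. It assumes the common convention that rows are fluorochromes and columns are detectors, so that observed = true @ spill. The 2 × 2 matrix and the event values are made up for illustration.

```python
import numpy as np

# Hypothetical 2-channel spillover matrix (rows: fluorochrome, columns: detector).
spill = np.array([[1.00, 0.10],
                  [0.05, 1.00]])

# Made-up "true" fluorochrome signals for three events.
true_signal = np.array([[1000.0, 0.0],
                        [0.0, 500.0],
                        [200.0, 300.0]])

# What the detectors would record under the assumed model.
observed = true_signal @ spill

# Compensation: solve X @ spill = observed for X.
# np.linalg.solve handles A @ x = b, so we solve the transposed system.
compensated = np.linalg.solve(spill.T, observed.T).T

print(np.allclose(compensated, true_signal))
```

Solving the linear system avoids forming the inverse of the spillover matrix explicitly, which is usually the numerically safer choice.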

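The header keywords also allow a quick consistency check. The sketch below, using values copied from the metadata printed above, compares the size of the data segment with the size expected from the number of events, the number of parameters and the bit width per value. Note that the values mix integers and strings, and that some strings carry padding whitespace. The dictionary is a hand-made excerpt, not the full adata.uns["meta"].

```python
# Excerpt of the metadata shown above (hand-copied; types as printed).
meta = {
    "$TOT": 241552,                     # number of events
    "$PAR": 20,                         # number of parameters
    "$BEGINDATA": "6339",
    "$ENDDATA": "19330498           ",  # padded with trailing spaces
}
bits_per_value = 32  # $PnB is 32 for every channel in the table above

begin = int(meta["$BEGINDATA"].strip())
end = int(meta["$ENDDATA"].strip())

# Both offsets are inclusive, hence the +1.
segment_bytes = end - begin + 1
expected_bytes = meta["$TOT"] * meta["$PAR"] * bits_per_value // 8

print(segment_bytes, expected_bytes)  # 19324160 19324160
```
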
Missing marker column

In some FCS files, the marker information is not stored under the $PnS keywords (pattern $P[0-9]+S), and reading the FCS file may fail. In that case, pass reindex=False when reading the file.

adata = pm.io.read_fcs(path_data, reindex=False)
adata
AnnData object with n_obs × n_vars = 241552 × 20
    var: 'channel', 'marker', '$PnR', '$PnB', '$PnE', '$PnV', '$PnG'
    uns: 'meta'

The .var section of the AnnData object again contains the channel information, but var_names is now a running number. Marker names can then be assigned manually, for example based on the channel column.

adata.var
n   channel  marker     $PnR    $PnB  $PnE  $PnV  $PnG
1   FSC-A               262144  32    0,0   510   1.0
2   FSC-H               262144  32    0,0   510   1.0
3   FSC-W               262144  32    0,0   510   1.0
4   SSC-A               262144  32    0,0   310   1.0
5   SSC-H               262144  32    0,0   310   1.0
6   SSC-W               262144  32    0,0   310   1.0
7   R660-A   CD95       262144  32    0,0   490   1.0
8   R780-A   CD8        262144  32    0,0   475   1.0
9   B515-A   CD27       262144  32    0,0   470   1.0
10  B710-A   CXCR4      262144  32    0,0   417   1.0
11  V450-A   CCR7       262144  32    0,0   400   1.0
12  V545-A   LIVE/DEAD  262144  32    0,0   495   1.0
13  V605-A   CD4        262144  32    0,0   400   1.0
14  V655-A   CD45RA     262144  32    0,0   375   1.0
15  V800-A   CD3        262144  32    0,0   400   1.0
16  G560-A   CD49B      262144  32    0,0   400   1.0
17  G610-A   CD14/19    262144  32    0,0   415   1.0
18  G660-A   CD69       262144  32    0,0   470   1.0
19  G780-A   CD103      262144  32    0,0   435   1.0
20  Time                262144  32    0,0         0.01
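
One possible way to assign readable names by hand is sketched below. It uses the marker where one is present and falls back to the channel name otherwise, which mirrors the default naming seen earlier. The DataFrame is a hypothetical stand-in for a few rows of the table above, and the handling of empty markers (empty string or NaN) is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of adata.var read with reindex=False.
var = pd.DataFrame(
    {
        "channel": ["FSC-A", "SSC-A", "R660-A", "R780-A", "Time"],
        "marker": ["", "", "CD95", "CD8", ""],
    },
    index=pd.Index([1, 4, 7, 8, 20], name="n"),
)

# Use the marker where available, otherwise fall back to the channel name.
marker = var["marker"].fillna("").str.strip()
new_names = marker.where(marker != "", var["channel"])

# Variable names should be unique before they are used as an index.
if not new_names.is_unique:
    raise ValueError("derived variable names are not unique")

var.index = pd.Index(new_names.to_numpy())
print(var.index.tolist())
```

On an AnnData object, the analogous step would be to assign the derived names to adata.var_names.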