
Supplementary materials

Supplementary S1. Analysis methods of the lung ultrasound examinations

All examinations were performed by five investigators using a SonoSite-Edge II ultrasound machine reserved for the COVID-19 unit. All measurements were made with a 10-5 MHz linear transducer or a 5-3 MHz curvilinear transducer, using the lung examination setting and a depth of >6 centimeters.

Offline analyses of all ultrasound images from the 191 examinations were performed by two investigators (MLAH and AWEL) who were blinded to the patients' baseline characteristics. The investigators scored the involvement per zone as follows [1-3]:

0 = A-line pattern; 1 = Well-separated B-lines; 2 = Confluent B-lines; 3 = Consolidation.

In order to appropriately compare pulmonary involvement across protocols, a LUS index (LUSI = (total LUS / total LUS achievable) × 100) was calculated.
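The LUSI normalisation can be illustrated with a minimal Python sketch. The function name `lus_index`, the zone counts, and the example scores are hypothetical and chosen only for illustration; the sketch assumes each zone is scored 0 to 3 as described above.

```python
def lus_index(zone_scores, max_score_per_zone=3):
    """Return the LUS index (LUSI): total LUS as a percentage of the achievable total."""
    if not zone_scores:
        raise ValueError("at least one zone score is required")
    if any(s not in range(max_score_per_zone + 1) for s in zone_scores):
        raise ValueError("zone scores must be integers from 0 to max_score_per_zone")
    total = sum(zone_scores)
    achievable = max_score_per_zone * len(zone_scores)
    return 100.0 * total / achievable


# Hypothetical examinations with different numbers of zones (illustrative values only)
twelve_zone = [2, 1, 0, 3, 2, 2, 1, 0, 1, 2, 3, 1]  # total 18 of 36
eight_zone = [2, 1, 0, 3, 2, 2, 1, 1]               # total 12 of 24

print(lus_index(twelve_zone))  # 50.0
print(lus_index(eight_zone))   # 50.0
```

Because the score is expressed relative to the maximum achievable total, examinations with different numbers of zones end up on the same 0-100 scale.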

Supplementary S2. The calculation method for the smallest detectable change

Systematic and random errors in measurements produce differences in LUS that are not attributable to true changes in pulmonary involvement. These measurement errors can be quantified as the standard error of measurement (SEM). The SEM can be obtained from a two-way mixed-effects intraclass correlation coefficient (ICC) model for absolute agreement, using the following formula [4]:

$$\mathrm{SEM}_{\mathrm{interrater}} = \sigma \times \sqrt{1 - \mathrm{ICC}_{\mathrm{interrater}}}$$

The SEM can then be used to calculate the smallest detectable change (SDC), which represents the minimal change a score must show to ensure that the observed change is true and not a result of measurement error. The following formula can be used:

$$\mathrm{SDC}_{\mathrm{interrater}} = 1.96 \times \sqrt{2} \times \mathrm{SEM}_{\mathrm{interrater}}$$

A power calculation was performed to determine the required sample size for the ICC. A minimum acceptable reliability of 0.65, with an expected reliability of 0.89 based on previous research [5], a power of 0.90, and a significance level of 0.05 resulted in a sample size of 27 examinations for two raters.
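The method used for this power calculation is not stated. As an illustration only, the sketch below applies one widely used approximation for ICC sample sizes (commonly attributed to Walter and colleagues), assuming a two-sided significance level. The function name `icc_sample_size` and its parameters are hypothetical. Under these assumptions the inputs above yield 27 examinations.

```python
import math
from statistics import NormalDist


def icc_sample_size(rho0, rho1, raters=2, alpha=0.05, power=0.90, two_sided=True):
    """Approximate number of subjects needed to show ICC > rho0 when the true ICC is rho1.

    Assumption: closed-form approximation based on the variance-ratio
    parameterisation theta = rho / (1 - rho); not confirmed as the authors' method.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    theta0 = rho0 / (1 - rho0)
    theta1 = rho1 / (1 - rho1)
    c0 = (1 + raters * theta0) / (1 + raters * theta1)
    k = 1 + (2 * (z_alpha + z_beta) ** 2 * raters) / (math.log(c0) ** 2 * (raters - 1))
    return math.ceil(k)


# Minimum acceptable ICC 0.65, expected ICC 0.89, two raters
print(icc_sample_size(0.65, 0.89))  # 27 under the two-sided assumption
```

With a one-sided significance level the same approximation gives a smaller sample, so the two-sided setting is the one consistent with the reported 27 examinations.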

Twenty-seven examinations were then selected from the total sample of 191 using a random number generator. These examinations were evaluated by both investigators (MLAH and AWEL). The interrater ICC of the investigators was 0.870, and the mean ± standard deviation was 66.6 ± 17.6. As a result, the SEM was 6.3 and the SDC was 17.4%. The 95% confidence interval for the SDC was 11.8-26.1%.
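As a worked check of the two formulas, the following Python sketch recomputes the SEM and SDC from the rounded values reported above (standard deviation 17.6, ICC 0.870). The function names are hypothetical. Because only rounded inputs are available here, the result can be expected to approximate rather than exactly match the reported SDC.

```python
import math


def sem_from_icc(sd, icc):
    """Standard error of measurement: SEM = sd * sqrt(1 - ICC)."""
    return sd * math.sqrt(1 - icc)


def sdc_from_sem(sem, z=1.96):
    """Smallest detectable change: SDC = z * sqrt(2) * SEM."""
    return z * math.sqrt(2) * sem


sem = sem_from_icc(sd=17.6, icc=0.870)
sdc = sdc_from_sem(sem)
print(round(sem, 1))  # 6.3
print(round(sdc, 1))  # 17.6, close to the reported 17.4%
```

The small gap between 17.6 and the reported 17.4% may reflect rounding of the published inputs or a slightly different variance estimate; the source does not specify which.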

The Bland-Altman plot was created in accordance with previous literature [6]. In a linear model of the difference (Bland-Altman Y-axis) as a function of the mean (Bland-Altman X-axis), the coefficient of the mean was 0.04, with a P-value > 0.05, indicating that the proportional bias was neither statistically significant nor clinically relevant. The constant bias was 1.9, with a 95% confidence interval of 1.12-2.69, and the limits of agreement were 10.8, with a 95% confidence interval of 7.4-14.2.
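The quantities in this paragraph can be sketched in Python as follows. This is a minimal illustration, not the authors' analysis code: the function name `bland_altman` and the paired scores are hypothetical, and the limits of agreement are assumed to be bias ± 1.96 standard deviations of the differences. Proportional bias is estimated as the ordinary least-squares slope of the difference on the mean.

```python
from statistics import mean, stdev


def bland_altman(rater_a, rater_b):
    """Return constant bias, limits-of-agreement half-width, and proportional-bias slope."""
    if len(rater_a) != len(rater_b) or len(rater_a) < 3:
        raise ValueError("need two equally long series with at least three pairs")
    diffs = [a - b for a, b in zip(rater_a, rater_b)]        # Bland-Altman Y-axis
    means = [(a + b) / 2 for a, b in zip(rater_a, rater_b)]  # Bland-Altman X-axis
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # limits of agreement are bias +/- half_width
    # Ordinary least-squares slope of the difference on the mean
    m_bar = mean(means)
    sxx = sum((m - m_bar) ** 2 for m in means)
    sxy = sum((m - m_bar) * (d - bias) for m, d in zip(means, diffs))
    slope = sxy / sxx
    return {
        "bias": bias,
        "loa_half_width": half_width,
        "lower_loa": bias - half_width,
        "upper_loa": bias + half_width,
        "slope": slope,
    }


# Hypothetical paired LUSI values from two raters (illustrative only)
rater_a = [50.0, 62.5, 70.8, 45.8, 83.3, 66.7]
rater_b = [47.9, 60.4, 70.8, 41.7, 81.3, 66.7]
result = bland_altman(rater_a, rater_b)
print(result["bias"], result["loa_half_width"], result["slope"])
```

A slope near zero corresponds to the absence of proportional bias described above; testing its significance would additionally require the standard error of the slope, which is omitted here for brevity.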


The comparison between the SDC and the limits of agreement was estimated from 10,000 seeded bootstrapped comparisons in R, a language for statistical computing, with the tidyverse suite of packages. The resulting P-value was 0.019.


References for supplementary materials

1. Bouhemad B, Liu ZH, Arbelot C et al (2010) Ultrasound assessment of antibiotic-induced pulmonary reaeration in ventilator-associated pneumonia. Crit Care Med 38(1):84-92. https://doi.org/10.1097/CCM.0b013e3181b08cdb

2. Mayo PH, Copetti R, Feller-Kopman D et al (2019) Thoracic ultrasonography: a narrative review. Intensive Care Med 45(9):1200-1211. https://doi.org/10.1007/s00134-019-05725-8

3. Heldeweg M, Matta J, Haaksma M et al (2020) Lung ultrasound and computed tomography to monitor COVID-19 pneumonia in critically ill patients: a two-center prospective cohort study. Intensive Care Med Exp 9(1):1. https://doi.org/10.1186/s40635-020-00367-3

4. de Vet HC, Terwee CB, Knol DL et al (2006) When to use agreement versus reliability measures. J Clin Epidemiol 59(10):1033-9. https://doi.org/10.1016/j.jclinepi.2005.10.015

5. Lieveld AW, Kok B, Schuit FH et al (2020) Diagnosing COVID-19 pneumonia in a pandemic setting: Lung Ultrasound versus CT (LUVCT) – a multicentre, prospective, observational study. ERJ Open Res 6(4):00539-2020. https://doi.org/10.1183/23120541.00539-2020

6. Bland JM, Altman DG (1986) Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1(8476):307-10. https://doi.org/10.1016/S0140-6736(86)90837-8