Load the sample data.

This simulated data is from a manufacturing company that operates 50 factories across the world, with each factory running a batch process to create a finished product. The company wants to decrease the number of defects in each batch, so it developed a new manufacturing process. To test the effectiveness of the new process, the company selected 20 of its factories at random to participate in an experiment: Ten factories implemented the new process, while the other ten continued to run the old process. In each of the 20 factories, the company ran five batches (for a total of 100 batches) and recorded the following data:

Flag to indicate whether the batch used the new process (`newprocess`

)

Processing time for each batch, in hours (`time`

)

Temperature of the batch, in degrees Celsius (`temp`

)

Categorical variable indicating the supplier (`A`

, `B`

, or `C`

) of the chemical used in the batch (`supplier`

)

Number of defects in the batch (`defects`

)

The data also includes `time_dev`

and `temp_dev`

, which represent the absolute deviation of time and temperature, respectively, from the process standard of 3 hours at 20 degrees Celsius.

Fit a generalized linear mixed-effects model using `newprocess`

, `time_dev`

, `temp_dev`

, and `supplier`

as fixed-effects predictors. Include a random-effects term for intercept grouped by `factory`

, to account for quality differences that might exist due to factory-specific variations. The response variable `defects`

has a Poisson distribution, and the appropriate link function for this model is log. Use the Laplace fit method to estimate the coefficients. Specify the dummy variable encoding as `'effects'`

, so the dummy variable coefficients sum to 0.

The number of defects can be modeled using a Poisson distribution:

$${\text{defects}}_{ij}\sim \text{Poisson}({\mu}_{ij})$$

This corresponds to the generalized linear mixed-effects model

$$\mathrm{log}({\mu}_{ij})={\beta}_{0}+{\beta}_{1}{\text{newprocess}}_{ij}+{\beta}_{2}{\text{time}\text{\_}\text{dev}}_{ij}+{\beta}_{3}{\text{temp}\text{\_}\text{dev}}_{ij}+{\beta}_{4}{\text{supplier}\text{\_}\text{C}}_{ij}+{\beta}_{5}{\text{supplier}\text{\_}\text{B}}_{ij}+{b}_{i},$$

where

$${\text{defects}}_{ij}$$ is the number of defects observed in the batch produced by factory $$i$$ during batch $$j$$.

$${\mu}_{ij}$$ is the mean number of defects corresponding to factory $$i$$ (where $$i=1,2,...,20$$) during batch $$j$$ (where $$j=1,2,...,5$$).

$${\text{newprocess}}_{ij}$$, $${\text{time}\text{\_}\text{dev}}_{ij}$$, and $${\text{temp}\text{\_}\text{dev}}_{ij}$$ are the measurements for each variable that correspond to factory $$i$$ during batch $$j$$. For example, $${\text{newprocess}}_{ij}$$ indicates whether the batch produced by factory $$i$$ during batch $$j$$ used the new process.

$${\text{supplier}\text{\_}\text{C}}_{ij}$$ and $${\text{supplier}\text{\_}\text{B}}_{ij}$$ are dummy variables that use effects (sum-to-zero) coding to indicate whether company `C`

or `B`

, respectively, supplied the process chemicals for the batch produced by factory $$i$$ during batch $$j$$.

$${b}_{i}\sim N(0,{\sigma}_{b}^{2})$$ is a random-effects intercept for each factory $$i$$ that accounts for factory-specific variation in quality.

Predict the response values at the original design values.

Create a new table by copying the first 10 rows of `mfr`

into `tblnew`

.

The first 10 rows of `mfr`

include data collected from trials 1 through 5 for factories 1 and 2. Both factories used the old process for all of their trials during the experiment, so `newprocess = 0`

for all 10 observations.

Change the value of `newprocess`

to `1`

for the observations in `tblnew`

.

Compute predicted response values and nonsimultaneous 99% confidence intervals using `tblnew`

. Display the first 10 rows of the predicted values based on `tblnew`

, the predicted values based on `mfr`

, and the observed response values.

ans = *10×3*
3.4536 4.9883 6.0000
4.1142 5.9423 7.0000
3.5530 5.1318 6.0000
3.8976 5.6295 5.0000
3.7040 5.3499 6.0000
3.6095 5.2134 5.0000
3.2146 4.6430 4.0000
3.1393 4.5342 4.0000
3.7320 5.3903 9.0000
3.2214 4.6529 4.0000

Column 1 contains predicted response values based on the data in `tblnew`

, where `newprocess = 1`

. Column 2 contains predicted response values based on the original data in `mfr`

, where `newprocess = 0`

. Column 3 contains the observed response values in `mfr`

. Based on these results, if all other predictors retain their original values, the predicted number of defects appears to be smaller when using the new process.

Display the 99% confidence intervals for rows 1 through 10 corresponding to the new predicted response values.

ans = *10×2*
1.6983 7.0235
1.9191 8.8201
1.8735 6.7380
2.0149 7.5395
1.9034 7.2079
1.8918 6.8871
1.6776 6.1597
1.5404 6.3976
1.9574 7.1154
1.6892 6.1436