CHAPTER 3
Statistical Modeling Constructs
SIMPROCESS supports the analysis of processes using discrete event simulation. This means that SIMPROCESS models systems by taking the processes that happen in the real world and breaking them down into the key events that occur. For example, parts move to a station in a factory, are processed, and then move to the next station. If one part goes into a station and one part comes out, then the most important aspect of modeling the station is to capture the processing time (which, in the real world, will vary due to any number of factors) with a statistical distribution.
A statistical distribution is used to give the model the randomness that always occurs in the real world. Of course, if things in the real world never varied, simulation would not be needed; mean-value analysis of the system would suffice. Mean-value analysis is a simple way of predicting system performance by looking only at average rates. Unfortunately, the real world almost never matches the performance predicted by mean-value analysis, because statistical fluctuations almost always need to be taken into account.
Expertise in statistics or modeling is not needed to use SIMPROCESS, though; just having some idea of how things vary is enough. If data shows that it usually takes only 5 minutes for a clerk to process some paperwork, but it can take as long as 15, that is the beginning of a statistical model of that clerk. Exact answers are not necessary, just general behavior. Introducing a small amount of randomness through simulation can be all that is needed to transform a simplistic mean-value analysis into a realistic model.
SIMPROCESS provides a visual flow through the system, in addition to being able to model the statistical nature of processes. It is very difficult to know how people, machinery, deliveries, and resources are going to interact in a proposed system. SIMPROCESS shows how these interactions might occur while still in the planning stage. Animation provides valuable insight into how things work and how they could work.
Random Number Generation and Standard Distributions
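To make the contrast between mean-value analysis and simulation concrete, the following Python sketch (not SIMPROCESS code) models a single clerk. It assumes exponentially distributed times between arrivals and processing times, and the function name, the 6 and 5 minute averages, and the seed are all hypothetical. Because the average processing time is shorter than the average time between arrivals, mean-value analysis predicts that nobody waits; the simulated average wait is typically well above zero.

```python
import random

def average_wait(mean_interarrival, mean_service, customers=200_000, seed=1):
    """Estimate the average time a customer waits for a single clerk.

    Assumption: both the time between arrivals and the processing time
    are drawn from exponential distributions with the given means.
    """
    rng = random.Random(seed)
    wait = 0.0    # waiting time of the current customer
    total = 0.0
    for _ in range(customers):
        total += wait
        service = rng.expovariate(1.0 / mean_service)
        gap = rng.expovariate(1.0 / mean_interarrival)
        # The next customer waits for whatever work is left when it arrives.
        wait = max(0.0, wait + service - gap)
    return total / customers

# Hypothetical clerk: one arrival every 6 minutes, 5 minutes of work each.
simulated = average_wait(6.0, 5.0)
print("Mean-value prediction: 0.0 minutes of waiting")
print(f"Simulated average wait: {simulated:.1f} minutes")
```

The fixed seed only makes the sketch repeatable; changing it changes the estimate slightly but not the conclusion.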
Random Number Generation
SIMPROCESS contains 215 random number streams, each having a different random number seed. The available seeds can be viewed from Seeds on the View menu.
The purpose of using different random number streams in the model is to control variance. This is an advanced topic well beyond the scope of this manual. Please refer to texts on statistics and randomness, including Simulation Modeling and Analysis by Law and Kelton (McGraw-Hill).
Standard Distributions
SIMPROCESS includes many standard probability distributions built in, and you can also create empirical distributions from your own data. The following paragraphs discuss selecting distributions and provide brief descriptions of some of the most common standard distributions.
Choosing the Right Distribution for Your Data
It is not always clear which of the standard distributions should be used. Fortunately, simulation results are usually not highly dependent on the choice of distributions. A distribution with approximately the right shape should be adequate.
ExpertFit (Tools menu) uses sophisticated statistical tests to determine the best fit distribution to a set of experimental data. Additional information on ExpertFit is covered in the ExpertFit documentation, which is located on the Help menu in ExpertFit.
ModelFit, which is used with the Auto Fits feature, also fits distributions to empirical data. (See "Auto Fits Distributions" on page 101.)
Common Distributions
Poisson: Used to model arrivals where the quantity within a time frame is known. Parameter: Mean.
Exponential: Used to model the time between arrivals. Parameter: Mean.
Normal: Bell-shaped curve; avoid when the mean is near zero. Also, the normal distribution is not usually a good distribution for service times, since service times are usually skewed to the right. Parameters: Mean, Standard Deviation.
Lognormal: Can be a good distribution for data that is skewed. Parameters: Mean, Standard Deviation.
Triangular: Useful when the minimum, most likely, and maximum times to perform a task are known. Parameters: Minimum, Mode, Maximum.
A complete listing of the statistical distributions available in SIMPROCESS is provided in Appendix D. See "Statistical Distributions" on page 477.
User Defined Distributions
There are three methods for creating User Defined Distributions listed on the Define pull-down menu. The first, Standard..., customizes an existing SIMPROCESS distribution. The second, Tabular..., creates a statistical distribution from discrete data points using a table format. The third, Auto Fits..., automatically fits a distribution from sample data at the beginning of a simulation. The sample data can be in an ASCII file, spreadsheet, or database. ModelFit is used to perform the distribution fitting. (See Help/About SIMPROCESS... for information on ModelFit.) Note that the Auto Fit feature is licensed separately from SIMPROCESS.
These User Defined Distributions can be used anywhere in the model where a statistical distribution is specified.
Standard Distributions
1. Using the Define Distributions pull-down menu, select Standard.... Use this dialog to Add, Edit, Copy, or Remove User Defined Distributions found in the list box. Undo retrieves a distribution that has been removed.
2. Choosing the Add, Edit, or Copy button brings up another dialog as shown below, where you type in the Name and distribution:
Name is any unique non-blank user distribution name.
Select a distribution for modification.
3. Using the distribution text box, either change the parameters of the distribution within the parentheses or select the details button (three dots). This button opens a dialog containing the parameter descriptions, e.g., Erlang parameters.
4. To see the PDF (Probability Density Function) and CDF (Cumulative Distribution Function) plotted, choose the View button from the dialog shown in Step 3. The PDF is labeled on the left y-axis, the CDF on the right y-axis.
5. The graph can be saved to a file. To continue, select Close from the graph's File pull-down menu.
6. Once the data is entered, choose OK, and the defined User Distribution is added to the list box. This User Defined Distribution can now be used anywhere in the model where a statistical distribution is specified.
7. To copy an existing User Distribution, select the distribution, choose the Copy button, and enter another Name for the distribution.
8. To remove an existing User Distribution, select the distribution, and choose the Remove button.
9. Choose the Close button to finish with User Distributions.
Tabular Distributions
Tabular Distributions creates a statistical distribution from a table of discrete data points.
1. Select Tabular... from the Define Distributions pull-down menu. Use this dialog box to Add, Edit, Copy, or Remove Tabular Distributions in the list box.
2. Choosing the Add, Edit, or Copy button brings up another dialog as shown below:
3. For Name, type any unique non-blank User Distribution name.
4. Type is selected from the list box. Choose either a Discrete or Continuous probability distribution function. If Discrete is chosen, only the exact values indicated in the right column will be chosen when the distribution is sampled. If Continuous is chosen, SIMPROCESS will interpolate from the specified Values and the probabilities associated with them.
5. To update the table, point and click on the cell you wish to modify. Type the number in the table cell directly.
Rows may be added and deleted by clicking on a cell and then choosing the appropriate button on the right. The row will be added above the one selected. The Erase Cell button causes the entry in the selected cell to be erased.
A table may also be populated by importing data from a text file using the Import button.
Choose the View button to see the PDF and CDF plotted. The graph can be saved to a file. Select Close from the graph's File pull-down menu to continue.
6. Once the data is entered, choose OK, and the Tabular Distribution is added to the list box. This Tabular Distribution can now be used anywhere in the model where a statistical distribution is specified.
7. To copy an existing Tabular Distribution, select the distribution, choose the Copy button, and enter another Name for the distribution.
8. To remove an existing Tabular Distribution, select the distribution, and choose the Remove button.
9. Choose the Close button when the Tabular Distributions are complete.
Auto Fits Distributions
Creating an Auto Fit Distribution
Auto Fits Distributions creates a statistical distribution from external sample data using ModelFit.
1. Select Auto Fits... from the Define Distributions pull-down menu. Use this dialog box to Add, Edit, Copy, or Remove Auto Fits Distributions in the list box.
2. Choosing the Add, Edit, or Copy button brings up another dialog as shown below:
3. For Name, type any unique non-blank User Distribution name.
4. There are three tabs (File, Spreadsheet, and Database) that are used to specify the source of the sample data. One or more types of data source can be used. However, only one list of sample data will be fitted. Thus, if more than one data source is used, the data from all selected sources are combined into one list for distribution fitting.
5. The File tab is used to specify an ASCII file as the data source. Use File Data Source must be selected to use this type of data source. The Browse button can be used to locate the file. It is recommended that the file be located in the model's directory. Note that the file must contain a single column of numbers with no header and no non-numeric values.
6. The Spreadsheet tab specifies a spreadsheet as a data source. The Spreadsheet can be a Workbook file or an XML Spreadsheet. Once Use Spreadsheet Data Source has been selected, four fields are required: File Name, Sheet Name, Starting Row, and Column. The Browse button can be used to locate the spreadsheet file. Again, it is recommended that the file be in the model's directory. Enter the name of the worksheet in the Sheet Name field. Starting Row should have the row number (rows start with 1) that contains the first value. Column should have the column number (column A would be column number 1) of the sample data. The data must be in a single column with no empty cells. The data is read beginning at the Starting Row and continues until an empty cell is reached. Both Starting Row and Column are distribution list fields. Thus, the Starting Row and Column can be parameterized (such as Evl(Model.StartingRow) or Evl(Model.Column), where Model.StartingRow and Model.Column are Model Attributes designated as Model Parameters).
7. The Database tab is used to specify an SQL database as the data source. Use Database Data Source must be selected to use this type of data source. Two items are required: DSN/Properties File and SQL Query. DSN/Properties File must contain the database location/connection information. DSN stands for Data Source Name and is the DSN established in the Windows ODBC Control Panel. (See "Setting Up Database Using Windows and Open Database Connectivity (ODBC)" on page 296 for more information on DSNs.) Properties File is the name of the properties file that contains the JDBC url and driver information and, if necessary, username and password. A Properties File must be used on a non-Windows system and must be located in the model's directory. (See "OpenDatabase" on page 298 for more information on Properties Files.) SQL Query must contain the query that returns the sample data. The query can be parameterized by using model attributes. For example, "Select TIME_VALUE From TimeTable" can be changed to "Select Model.TableColumn From TimeTable". In this example, Model.TableColumn is a STRING Model Attribute that contains the name of the column to read. Note that if the query returns multiple columns, only the first column will be read for sample data.
8. Below the data source section of the dialog are two checkboxes: Set Default Type and Set Default Bounds. These checkboxes allow "hints" about the sample data to be passed to ModelFit. Information about the sample data can be useful in obtaining a proper distribution fit. If Set Default Type is selected (default), then either Continuous (default) or Discrete can be selected. For example, sample data representing time between arrivals is usually Continuous, whereas sample data representing number of arrivals is usually Discrete. If Set Default Bounds is selected, then the lower and upper bounds can be set and passed to ModelFit. Enter Lower Bound Value contains two selections: -Infinity and 0.0. Enter Upper Bound Value contains one selection: Infinity. Other numeric values can be used as bounds by typing the values in the appropriate field. For instance, if the sample data represents task times, then there could be a minimum and/or maximum time to perform the task. Thus, the fitted distribution should reflect those times. In the example below, 5 is the lowest possible value that the fitted distribution should return. This means the task represented by the sample data has a minimum completion time of 5 (minutes, hours, etc.). Note that if neither checkbox is selected, then no information about the sample data is passed to ModelFit.
9. When selected, Always Execute Auto Fit at Simulation Start causes the data sources to be read and the distribution to be fitted every time the model is run. Note that fitting will also occur at the beginning of a simulation if Always Execute Auto Fit at Simulation Start is not selected and the Fitted Distribution field is empty (Fit Now has not been executed). An error will occur at simulation start if distribution fitting is attempted and the Auto Fit license has not been activated.
10. Stream contains the random number stream (1 - 215) to use when drawing random samples from the Fitted Distribution.
11. The View Data button reads the data sources and displays a list of the sample data along with summary statistics. The summary statistics include Number of Samples, Minimum, Maximum, Mean, Standard Deviation, Median, Skewness, and Coefficient of Variation.
Note that if the Spreadsheet data source Starting Row or Column or Database data source SQL Query are parameterized, the View Data button will not return data. A parameterized Spreadsheet or Database data source can only be read at the beginning of a simulation. View Data includes data validation so it can be used to verify the data before a distribution fit.
12. The Fit Now button reads the data sources and calls ModelFit to perform the distribution fitting. If a good fit is obtained, the Fitted Distribution field displays the results. Note that Fit Now will not work if either the Spreadsheet or Database data sources are parameterized (see above). Fit Now includes data and distribution validation so it can be used to verify the data and distribution fit before a simulation run. The example error below shows that ModelFit created a Uniform distribution with a negative lower bound. A negative lower bound is not allowed. (See "Statistical Distributions" on page 477 for information on distribution requirements.) Also, an error will occur if the Fit Now button is clicked and the Auto Fit license has not been activated.
13. The Fitted Distribution field contains the results of an Auto Fit. The contents of the field can be selected and copied, but any changes to the field will not be saved. The details button to the right of the Fitted Distribution field can be used to view the individual parameters and view a plot of the distribution. Any changes made to the distribution parameters on the detail dialog will not be saved.
14. ModelFit Alternatives contains alternative distribution fits for the sample data. An empty list means that ModelFit determined that there were no acceptable alternatives. To use an alternative distribution, either select the distribution then click the Select Alternative button or double click the desired alternative. The primary distribution will be placed into the list of alternatives, and the selected alternative will be placed into the Fitted Distribution field. Note that ModelFit Alternatives are only applicable when using the Fit Now button. It is not possible to select an alternative when Auto Fit is run at simulation start. The primary distribution will always be used when Auto Fit is run at the beginning of a simulation. (When running Auto Fit at simulation start, alternatives generated by ModelFit are written to the simprocess.log file.) To use an alternative as the primary distribution, Always Execute Auto Fit at Simulation Start must not be selected.
15. ModelFit Warnings contains warning messages from ModelFit concerning the Fitted Distribution. These warnings are only displayed here when the Fit Now button is used. The warnings are written to the simprocess.log file (SPSYSTEM directory) when fitting occurs at simulation start. None is displayed if there are no warnings. The text of ModelFit Warnings can be selected and copied using the platform-specific keyboard shortcuts.
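The summary statistics listed for the View Data button in step 11 can be approximated outside SIMPROCESS to sanity-check a data file before fitting. The Python sketch below is only an illustration: the function name and sample values are hypothetical, and the formulas (sample standard deviation, moment-based skewness) are assumptions, since this manual does not state which definitions SIMPROCESS uses.

```python
import statistics

def summarize(samples):
    """Compute the statistics named in step 11 for a list of numbers.

    Assumptions: sample (n - 1) standard deviation, and skewness as the
    third central moment divided by the cubed population standard
    deviation. Needs at least two distinct values.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    pstdev = statistics.pstdev(samples)
    third_moment = sum((x - mean) ** 3 for x in samples) / n
    return {
        "Number of Samples": n,
        "Minimum": min(samples),
        "Maximum": max(samples),
        "Mean": mean,
        "Standard Deviation": stdev,
        "Median": statistics.median(samples),
        "Skewness": third_moment / pstdev ** 3,
        "Coefficient of Variation": stdev / mean,
    }

# Hypothetical task times, as they might appear one per line in an ASCII file.
task_times = [5.0, 6.0, 7.0, 8.0, 14.0]
summary = summarize(task_times)
for name, value in summary.items():
    print(f"{name}: {value}")
```

A positive skewness, as in this made-up sample, indicates data skewed to the right, which is typical of service times.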
Empirical Distributions
Sometimes ModelFit cannot fit the sample data to one of the SIMPROCESS distributions. This could be due to the sample data representing a distribution that SIMPROCESS does not support, an insufficient sample size, or both. When this happens, ModelFit returns an empirical distribution. The distribution defaults to Continuous unless Discrete is selected in Set Default Type.
An empirical distribution is a histogram-based distribution and follows the format of the SIMPROCESS Tabular distribution (page 99). The distribution type displayed in the Fitted Distribution field will be Emp followed by pairs of cumulative probability and value. The pairs are just comma separated values. In the example below, the first two pairs of cumulative probability and value are (0.0060, 6.557) and (0.015, 7.715).
A table of the values can be displayed by selecting the details button next to the Fitted Distribution field. By selecting the View button, the dialog allows viewing a Continuous or Discrete (based on selected Type) plot of the empirical distribution. Note that changes to the Name, Type, or Stream are not saved.
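As a rough illustration of how a table of cumulative probability and value pairs can be sampled, the Python sketch below treats the Discrete case as "return the first value whose cumulative probability covers the random draw" and the Continuous case as linear interpolation between neighboring pairs. The function names and the table are hypothetical, and linear interpolation is an assumption; the manual says only that SIMPROCESS interpolates, not how.

```python
import bisect

def sample_discrete(pairs, u):
    """Return the first value whose cumulative probability is >= u."""
    probs = [p for p, _ in pairs]
    i = bisect.bisect_left(probs, u)
    return pairs[min(i, len(pairs) - 1)][1]

def sample_continuous(pairs, u):
    """Linearly interpolate between the two pairs that bracket u."""
    probs = [p for p, _ in pairs]
    if u <= probs[0]:
        return pairs[0][1]
    i = bisect.bisect_left(probs, u)
    if i >= len(pairs):
        return pairs[-1][1]
    (p0, v0), (p1, v1) = pairs[i - 1], pairs[i]
    return v0 + (v1 - v0) * (u - p0) / (p1 - p0)

# Hypothetical table of (cumulative probability, value) pairs, sorted by
# probability and ending at 1.0.
table = [(0.0, 5.0), (0.25, 7.0), (0.75, 10.0), (1.0, 15.0)]

u = 0.5  # in practice a uniform random number between 0 and 1
print(sample_discrete(table, u))    # 10.0
print(sample_continuous(table, u))  # 8.5
```

With the same draw, the discrete reading can only return values that appear in the table, while the continuous reading can return anything between the smallest and largest value.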
Run Settings
Run Settings
Simulation Period
The Simulation Period determines how long the simulation will run. It is given in calendar and time format. Choose an appropriate time span to see all aspects of the system.
Warmup Every Replication
When selected and the values for both the Number of Replications and Warmup Length are greater than 0, SIMPROCESS will start collecting statistics after the Warmup Length has expired for each replication.
Warmup Length
If a Warmup Length is entered, SIMPROCESS will start collecting statistics for the model run after the Warmup Length has elapsed. This gathers information on the system after it has reached "Steady State."
Warmup Time Unit
The Warmup Time Unit specifies the time unit for the Warmup Length. It defaults to Hours.
Simulation Time Unit
The Simulation Time Unit specifies the time unit for the simulation clock. It defaults to Hours.
Reset System
Resets the system to the initial conditions before each replication.
Verify Model on Run
Turns the automatic model verification on or off at the beginning of a simulation. The verification process checks to make sure all pads are connected.
Number of Replications
When the model contains randomness (represented by statistical distributions), the model should be run for multiple replications. This averages the results and gives a more accurate picture of the "most likely" outcome of the scenario being modeled.
Reports are gathered for each replication and for the Sum and Average of all runs.
Reset Random Number Streams
This option resets the Random Number Stream before starting each new replication. Typically, this should not be selected. In general terms, the reason for running the model for multiple replications is to test how randomness affects the system. Turning this option on negates that test.
RMI Host
This indicates the host on which SPServer will be located if the model has an External schedule defined. Defaults to "localhost." (See "Adding an External Schedule" on page 335.)
RMI Port
This indicates the TCP port to use on the RMI Host. Defaults to 1099. (See "Adding an External Schedule" on page 335.)
Cost Periods
This tab sets the frequency over which the ABC Reports will be collected. The name of the currency that the ABC Reports will use can also be set. (See "Setting Up Cost Periods" on page 179.)
Debug Traces
This tab turns on/off debug tracing and defines debug traces. If checked, Create Entity Debug Traces causes the creation of the debug traces defined in the box to the right. If no debug traces are defined, no debug traces are created during the simulation run. Click the Add... button to define a debug trace. This brings up a dialog for defining the debug trace.
Debug Trace Name is the name of the trace that will appear in the list of traces. Debug Trace File is the name of the debug trace file to be created. If no path is included, the file will be created in the directory of the model. Select the Entities to be included in the trace. Entity types to be included should be moved from the list of entities on the left to the list of entities on the right. The checkboxes at the bottom are the Entity actions that can be included in the trace. Note that the Clone, Assemble, Split, and Transform actions apply to the Entity that initiates the action, not the Entity or Entities that result from the action.
Each debug trace file is a tab delimited file. The replication is added to the file (Replication 1, Replication 2, etc.) and then a header line consisting of Action, Activity, Entity, SequenceNum, and SimTime. Each action added to the file consists of the action name (Initialize, Accept, etc.), the name of the activity, the name of the entity, the sequence number of the entity, and the simulation time.
Expression Output
The SIMPROCESS Expression Language has a statement called OUTPUT. This statement displays information in the SIMPROCESS Expression Output Dialog. (See "Expression Language Statements" on page 257.) OUTPUT is especially useful for debugging. The Expression Output tab offers options for the OUTPUT statement. (Note that these options also apply to the ShowSystemAttributes and ShowUserAttributes System Methods. See "SIMPROCESS System Methods" on page 521.)
There are two options: Suppress Dialog and Send Output to File. By default, neither option is selected. If selected, the Suppress Dialog option prevents the SIMPROCESS Expression Output Dialog from displaying during a simulation. (Suppressing the dialog causes the simulation to run faster.) If Send Output to File is selected, the OUTPUT statements are written to the file entered in the adjoining text field. Since the file defaults to the model's directory, the file name entered must not contain a path.
These options apply when running simulations using the Experiment Manager (see "Experiment Manager" on page 387). However, Suppress Dialog has no effect on the Experiment Manager Status Messages dialog.
SIMPROCESS models can be run without the SIMPROCESS Graphical User Interface (see "Running Models Without GUI" on page 599). The optimization tool OptQuest (see "OptQuest for SIMPROCESS" on page 401) and the SIMPROCESS Dispatcher do this. (The SIMPROCESS Dispatcher is an optional feature that allows SIMPROCESS to be used with a Web service. If the SIMPROCESS Dispatcher was installed, see SIMPROCESS Dispatcher.pdf in the dispatcher directory for more information.) If Suppress Dialog and Send Output to File are not selected, then OUTPUT statements are written to the appropriate log file. In most cases, the log file will be simprocess.log in the SPSYSTEM directory. When using OptQuest, the log file is optQuest.log in the SPSYSTEM directory. The Dispatcher log file is Dispatcher.log. Its location is dependent on where the Web service is located. If either Suppress Dialog or Send Output to File is selected, then no information is sent to a log file. When Send Output to File is selected, the OUTPUT statements are written to the designated file.
Whether running a model with or without the SIMPROCESS GUI, the file designated in Send Output to File will contain output for all replications; however, when the same model is run again, the file will be overwritten. So if a user, the Experiment Manager, OptQuest, or the SIMPROCESS Dispatcher runs the same model multiple times, the file will only contain information from the last run.
Time Server
SIMPROCESS models have the ability to let an outside agent manage the advancement of simulation time. The TimeServer is a separate application which can manage simulation time for multiple participants. It can be run on the same machine as SIMPROCESS or on another system. See SIMPROCESS TimeServer.pdf in the timeserver directory for more information, including how to configure and run the TimeServer application. The timeserver directory is in the SIMPROCESS installation directory.
The TimeServer supports one or more groups of players. Each group has a unique name among those handled by a single instance of the TimeServer, and each group manages its own time independently of all other groups. Thus, for a SIMPROCESS model to use the TimeServer, it must join a predefined TimeServer group as a player.
Time Server Settings
The Use Time Server option is initially not selected. When selected, IP Address, Port, Group, Player Name, and Synchronize to Nearest become enabled. Values for IP Address, Port, Group, and Player Name are required when using the TimeServer while Synchronize to Nearest is optional.
IP Address defaults to localhost, which indicates that the TimeServer will be running on the same system as SIMPROCESS. If the TimeServer will be running on a different system, then the IP address or host name of that system should be entered.
Port can be between 1024 and 65535 inclusive and defaults to 26100.
Group is the name of the TimeServer group that this model will join. Groups are defined as part of the TimeServer configuration. See SIMPROCESS TimeServer.pdf in the timeserver directory for more information.
Player Name is the name by which the simulation will be known to the TimeServer Group. This name cannot be duplicated within a Group.
In the example below, at the start of the simulation, the model will join the group CallCenter as CallCenter 1.
During simulation, SIMPROCESS creates events internally, each bearing a time stamp indicating the simulation time when it will be processed. Events are placed into a queue to be processed in the order of their simulation times. A large and complex model will handle a great many such events as it simulates.
When a TimeServer is used to manage simulation time externally, an additional step is introduced into this process. Before any event can be processed, SIMPROCESS must have gained approval from the TimeServer to advance to or beyond that event's time. Therefore, SIMPROCESS must recognize when it has events scheduled to occur later than the most recently approved time and send a request for time advancement to the TimeServer.
When Synchronize to Nearest is not selected, this event handling occurs naturally, as dictated by the simulation times of each event and the responses from the TimeServer. This is referred to as full synchronization. Events may occur with very small amounts of time between them. Large and complex models can send a significant number of time advancement requests to the TimeServer. Depending on the number of players participating in the simulation group, and the complexity of their respective simulations, this can result in a great deal of time spent simply waiting for time advancement approval and can therefore cause overall simulation performance to suffer.
Selecting Synchronize to Nearest can sometimes improve this situation. When selected, it enables the list for selecting an available time unit, which defaults to Hours. (All valid SIMPROCESS time units are available.) When used, SIMPROCESS will round all its requests for time advancement to integer numbers in the selected time unit (this time unit need not match the Simulation Time Unit, discussed on page 109, though it should usually not be a smaller unit). This is referred to as rounded synchronization.
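The difference between full and rounded synchronization can be sketched in a few lines of Python. This is not the TimeServer protocol itself: the function name and event times are hypothetical, times are in Hours, and the sketch assumes the TimeServer always approves exactly the time that was requested.

```python
import math

def count_time_requests(event_times, approved=0.0, unit=None):
    """Count the time advance requests needed to process the events.

    unit=None models full synchronization (request each event's own time).
    Otherwise each request is rounded up to a whole multiple of unit.
    Assumption: every request is approved in full.
    """
    requests = 0
    for t in sorted(event_times):
        if t > approved:
            # The event lies beyond the approved time, so ask for more.
            target = t if unit is None else math.ceil(t / unit) * unit
            requests += 1
            approved = target
    return requests

# Hypothetical event times in Hours, starting from an approved time of 1.0.
events = [1.25, 1.5, 1.75, 2.0, 2.4, 3.1]
full = count_time_requests(events, approved=1.0)
rounded = count_time_requests(events, approved=1.0, unit=1.0)
print(full, rounded)  # 6 3
```

In this made-up schedule, rounding to whole hours halves the number of requests; the actual benefit depends on how the other players in the group synchronize.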
As an example, assume that the current simulation time is 1.0 Hours and that the next event in the simulation is scheduled to occur at 1.25 Hours. Instead of sending a request for time advancement to 1.25 Hours, a request for 2.0 Hours will be sent. This is because 2 is the smallest integer value greater than 1.25. If the TimeServer approves the request by sending a time advance notification of 2.0 Hours back to the SIMPROCESS model, then simulation events with times less than or equal to 2.0 Hours will not require sending new time advance requests. So if the next three events had times of 1.5, 1.75, and 2.0 Hours, respectively, those events would not cause new time advance requests to be sent to the TimeServer, thus potentially improving performance by reducing the overall number of requests sent to the TimeServer and enabling SIMPROCESS to continue processing events without delay. (If a shorter time were approved instead, SIMPROCESS would process its events up to the newly approved time and submit another rounded time advance request.)
Note that the effectiveness of rounded synchronization will depend in part on the strategies employed by the other players in the group. If the other players are using full synchronization, or are using rounded synchronization with a smaller time unit, there may be little or no benefit.
Time Server Errors
Various errors can cause the premature end of the initialization or run of a simulation that is using the TimeServer.
· TimeServer is not running
· TimeServer stops
· Invalid Group
· Group is already full (all players have joined)
· Group is reinitialized
· Duplicate player name
· Communication error
· Another player terminates with an error condition (there is an option to continue in this case)
Time Server Considerations
When another player completes normally before the completion of the SIMPROCESS simulation, the simulation will continue until it is finished. However, if another player terminates for a different reason, SIMPROCESS will ask whether to continue or stop the simulation.
A SIMPROCESS model should not use External schedules along with a TimeServer. This can lead to unpredictable results. See "Adding an External Schedule," beginning on page 335, for more information on External schedules.
It is not strictly necessary for all SIMPROCESS players to use the same Number of Replications or to have the same setting for Reset System (both discussed on page 110). However, it is strongly suggested that all SIMPROCESS players in a group use the same Number of Replications and Reset System settings.