org.apache.derby.impl.store.access.sort
Class ExternalSortFactory

java.lang.Object
  extended by org.apache.derby.impl.store.access.sort.ExternalSortFactory
All Implemented Interfaces:
MethodFactory, ModuleControl, ModuleSupportable, SortCostController, SortFactory

public class ExternalSortFactory
extends java.lang.Object
implements SortFactory, ModuleControl, ModuleSupportable, SortCostController


Field Summary
protected static int DEFAULT_MAX_MERGE_RUN
           
protected static int DEFAULT_MEM_USE
           
private static int DEFAULT_SORTBUFFERMAX
           
private  int defaultSortBufferMax
           
private  UUID formatUUID
           
private static java.lang.String FORMATUUIDSTRING
           
private static java.lang.String IMPLEMENTATIONID
           
private static int MINIMUM_SORTBUFFERMAX
           
private static int SORT_ROW_OVERHEAD
           
private  int sortBufferMax
           
private  boolean userSpecified
           
 
Fields inherited from interface org.apache.derby.iapi.store.access.conglomerate.SortFactory
MODULE
 
Constructor Summary
ExternalSortFactory()
           
 
Method Summary
 void boot(boolean create, java.util.Properties startParams)
          Boot this module with the given properties.
 boolean canSupport(java.util.Properties startParams)
          See if this implementation can support any attributes that are listed in properties.
 void close()
          Close the controller.
 Sort createSort(TransactionController tran, int segment, java.util.Properties implParameters, DataValueDescriptor[] template, ColumnOrdering[] columnOrdering, SortObserver sortObserver, boolean alreadyInOrder, long estimatedRows, int estimatedRowSize)
          Create a sort.
 java.util.Properties defaultProperties()
          There are no default properties for the external sort.
 double getSortCost(DataValueDescriptor[] template, ColumnOrdering[] columnOrdering, boolean alreadyInOrder, long estimatedInputRows, long estimatedExportRows, int estimatedRowSize)
          Return the estimated cost of a sort.
 SortCostController openSortCostController()
          Return an open SortCostController.
 UUID primaryFormat()
          Return the primary format that this access method supports.
 java.lang.String primaryImplementationType()
          Return the primary implementation type for this access method.
 void stop()
          Stop the module.
 boolean supportsFormat(UUID formatid)
          Return whether this access method supports the format supplied in the argument.
 boolean supportsImplementation(java.lang.String implementationId)
          Return whether this access method implements the implementation type given in the argument string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

userSpecified

private boolean userSpecified

defaultSortBufferMax

private int defaultSortBufferMax

sortBufferMax

private int sortBufferMax

IMPLEMENTATIONID

private static final java.lang.String IMPLEMENTATIONID
See Also:
Constant Field Values

FORMATUUIDSTRING

private static final java.lang.String FORMATUUIDSTRING
See Also:
Constant Field Values

formatUUID

private UUID formatUUID

DEFAULT_SORTBUFFERMAX

private static final int DEFAULT_SORTBUFFERMAX
See Also:
Constant Field Values

MINIMUM_SORTBUFFERMAX

private static final int MINIMUM_SORTBUFFERMAX
See Also:
Constant Field Values

DEFAULT_MEM_USE

protected static final int DEFAULT_MEM_USE
See Also:
Constant Field Values

DEFAULT_MAX_MERGE_RUN

protected static final int DEFAULT_MAX_MERGE_RUN
See Also:
Constant Field Values

SORT_ROW_OVERHEAD

private static final int SORT_ROW_OVERHEAD
See Also:
Constant Field Values
Constructor Detail

ExternalSortFactory

public ExternalSortFactory()
Method Detail

defaultProperties

public java.util.Properties defaultProperties()
There are no default properties for the external sort.

Specified by:
defaultProperties in interface MethodFactory
See Also:
MethodFactory.defaultProperties()

supportsImplementation

public boolean supportsImplementation(java.lang.String implementationId)
Description copied from interface: MethodFactory
Return whether this access method implements the implementation type given in the argument string.

Specified by:
supportsImplementation in interface MethodFactory
See Also:
MethodFactory.supportsImplementation(java.lang.String)

primaryImplementationType

public java.lang.String primaryImplementationType()
Description copied from interface: MethodFactory
Return the primary implementation type for this access method. Although an access method may implement more than one implementation type, this is the expected one. The access manager will put the primary implementation type in a hash table for fast access.

Specified by:
primaryImplementationType in interface MethodFactory
See Also:
MethodFactory.primaryImplementationType()

supportsFormat

public boolean supportsFormat(UUID formatid)
Description copied from interface: MethodFactory
Return whether this access method supports the format supplied in the argument.

Specified by:
supportsFormat in interface MethodFactory
See Also:
MethodFactory.supportsFormat(org.apache.derby.catalog.UUID)

primaryFormat

public UUID primaryFormat()
Description copied from interface: MethodFactory
Return the primary format that this access method supports. Although an access method may support more than one format, this is the usual one. The access manager will put the primary format in a hash table for fast access to the appropriate method.

Specified by:
primaryFormat in interface MethodFactory
See Also:
MethodFactory.primaryFormat()

createSort

public Sort createSort(TransactionController tran,
                       int segment,
                       java.util.Properties implParameters,
                       DataValueDescriptor[] template,
                       ColumnOrdering[] columnOrdering,
                       SortObserver sortObserver,
                       boolean alreadyInOrder,
                       long estimatedRows,
                       int estimatedRowSize)
                throws StandardException
Create a sort. This method could choose among different sort options, depending on the properties etc., but currently it always returns a merge sort.

Specified by:
createSort in interface SortFactory
Throws:
StandardException - if the sort could not be opened for some reason, or if an error occurred in one of the lower level modules.
See Also:
SortFactory.createSort(org.apache.derby.iapi.store.access.TransactionController, int, java.util.Properties, org.apache.derby.iapi.types.DataValueDescriptor[], org.apache.derby.iapi.store.access.ColumnOrdering[], org.apache.derby.iapi.store.access.SortObserver, boolean, long, int)

openSortCostController

public SortCostController openSortCostController()
                                          throws StandardException
Return an open SortCostController.

Return an open SortCostController which can be used to ask about the estimated costs of SortController() operations.

Specified by:
openSortCostController in interface SortFactory
Returns:
The open SortCostController.
Throws:
StandardException - Standard exception policy.
See Also:
SortCostController

close

public void close()
Description copied from interface: SortCostController
Close the controller.

Close the open controller. This method always succeeds, and never throws any exceptions. Callers must not use the SortCostController after closing it; they are strongly advised to clear out the SortCostController reference after closing.

Specified by:
close in interface SortCostController
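
The close contract above (never use the controller after closing it, and clear the reference) can be sketched as follows. This is a minimal illustration only: the nested CostController interface and RecordingController class are hypothetical stand-ins written for this example, not Derby's SortCostController, and the returned cost is a placeholder.

```java
final class CostControllerUsageSketch {
    /** Hypothetical stand-in for the two controller methods used in this sketch. */
    interface CostController {
        double getSortCost(long estimatedInputRows);
        void close();
    }

    /** Toy controller that only records whether close() was called. */
    static final class RecordingController implements CostController {
        boolean closed;

        @Override
        public double getSortCost(long estimatedInputRows) {
            if (closed) {
                throw new IllegalStateException("controller is closed");
            }
            return 1.0; // placeholder value, not Derby's estimate
        }

        @Override
        public void close() {
            closed = true; // always succeeds, never throws
        }
    }

    private CostController controller;

    void open(CostController opened) {
        controller = opened;
    }

    boolean isOpen() {
        return controller != null;
    }

    double cost(long rows) {
        if (controller == null) {
            throw new IllegalStateException("no open controller");
        }
        return controller.getSortCost(rows);
    }

    /** Close the controller and drop the reference so it cannot be reused. */
    void release() {
        if (controller != null) {
            controller.close();
            controller = null;
        }
    }

    public static void main(String[] args) {
        CostControllerUsageSketch caller = new CostControllerUsageSketch();
        RecordingController rc = new RecordingController();
        caller.open(rc);
        try {
            System.out.println("cost = " + caller.cost(1000L));
        } finally {
            caller.release();
        }
        System.out.println("closed = " + rc.closed + ", open = " + caller.isOpen());
    }
}
```

Clearing the field in release() turns an accidental later use into an immediate, obvious failure instead of a call on a closed controller.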

getSortCost

public double getSortCost(DataValueDescriptor[] template,
                          ColumnOrdering[] columnOrdering,
                          boolean alreadyInOrder,
                          long estimatedInputRows,
                          long estimatedExportRows,
                          int estimatedRowSize)
                   throws StandardException
Return the estimated cost of a sort.

The sort algorithm is an N * log(N) algorithm. The following numbers were measured on a PII, 400 MHz machine, jdk117 with jit, insane.zip. The test is a simple "select * from table order by first_int_column". I then subtracted the time it takes to do "select * from table" from the result.

  number of rows   elapsed time in seconds
  --------------   -----------------------
  1000             0.20
  10000            10.5
  100000           80.0

We assume that the formula for sort performance is of the form: performance = K * N * log(N). Solving the equation for the 1000 and 100000 case we come up with: performance = 1 + 0.08 N ln(N)

NOTE: Apparently, these measurements were done on a faster machine than was used for other performance measurements used by the optimizer. Experiments show that the 0.08 multiplier is off by a factor of 4 with respect to other measurements (such as the time it takes to scan a conglomerate). I am correcting the formula to use 0.32 rather than 0.08. - Jeff

RESOLVE (mikem) - this formula is very crude at the moment and will be refined later. Known problems:
1) internal vs. external sort - we know that the performance of sort is discontinuous when we go from an internal to an external sort. A better model is probably a different set of constants for internal vs. external sort and some way to guess when this is going to happen.
2) current row size is never considered but is critical to performance.
3) estimatedExportRows is not used. This is a critical number to know if an internal vs. an external sort will happen.

Specified by:
getSortCost in interface SortCostController
Parameters:
template - A row which is prototypical for the sort. All rows inserted into the sort controller must have exactly the same number of columns as the template row. Every column in an inserted row must have the same type as the corresponding column in the template.
columnOrdering - An array which specifies which columns participate in ordering - see interface ColumnOrdering for details. The column referenced in the 0th columnOrdering object is compared first, then the 1st, etc.
alreadyInOrder - Indicates that the rows inserted into the sort controller will already be in order. This is used to perform aggregation only.
estimatedInputRows - The number of rows that the caller estimates will be inserted into the sort. This number must be >= 0.
estimatedExportRows - The number of rows that the caller estimates will be exported by the sorter. For instance if the sort is doing duplicate elimination and all rows are expected to be duplicates then the estimatedExportRows would be 1. If no duplicate elimination is to be done then estimatedExportRows would be the same as estimatedInputRows. This number must be >= 0.
estimatedRowSize - The estimated average row size of the rows being sorted. This is the client portion of the row size; it should not attempt to calculate Store's overhead. -1 indicates that the caller has no idea (and the sorter will use 100 bytes in that case). Used by the sort to make good choices about in-memory vs. external sorting, and to size merge runs. The client is not expected to estimate the per column/per row overhead of raw store, just to make a guess about the storage associated with each row (i.e. reasonable estimates for some implementations would be 4 for int, 8 for long, 102 for char(100), 202 for varchar(200), a number out of a hat for user types, ...).
Returns:
The estimated cost of the sort.
Throws:
StandardException - Standard exception policy.
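
As a rough illustration of the corrected formula described above, performance = 1 + 0.32 N ln(N), the following sketch evaluates it for a given row estimate. The class and method names are hypothetical, and returning 0.0 for an empty input is an assumption made here only to avoid evaluating ln(0); this is not presented as the actual body of getSortCost.

```java
final class SortCostSketch {
    /** Multiplier from the description: 0.08 corrected by a factor of 4. */
    static final double MULTIPLIER = 0.32;

    /**
     * Evaluate 1 + 0.32 * N * ln(N) for N estimated input rows.
     * The zero-row shortcut is an assumption of this sketch.
     */
    static double estimateSortCost(long estimatedInputRows) {
        if (estimatedInputRows < 0) {
            throw new IllegalArgumentException("estimatedInputRows must be >= 0");
        }
        if (estimatedInputRows == 0) {
            return 0.0; // avoid taking the log of 0
        }
        return 1.0 + MULTIPLIER * estimatedInputRows * Math.log(estimatedInputRows);
    }

    public static void main(String[] args) {
        // Row counts taken from the measurement table in the description.
        for (long n : new long[] {1_000L, 10_000L, 100_000L}) {
            System.out.printf("%d rows -> cost %.1f%n", n, estimateSortCost(n));
        }
    }
}
```

Note that, as the known problems listed above point out, a formula of this shape depends only on the input row count; row size and estimatedExportRows play no part in it.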

canSupport

public boolean canSupport(java.util.Properties startParams)
Description copied from interface: ModuleSupportable
See if this implementation can support any attributes that are listed in properties. This call may be made on a newly created instance before the boot() method has been called, or after the boot method has been called for a running module.

The module can check for attributes in the properties to see if it can fulfill the required behaviour. E.g. the raw store may define an attribute called RawStore.Recoverable. If a temporary raw store is required the property RawStore.recoverable=false would be added to the properties before calling bootServiceModule. If a raw store cannot support this attribute its canSupport method would return false. Also see the Monitor class's prologue to see how the identifier is used in looking up properties.
Actually a better way may be to have properties of the form RawStore.Attributes.mandatory=recoverable,smallfootprint and RawStore.Attributes.requested=oltp,fast

Specified by:
canSupport in interface ModuleSupportable
Returns:
true if this instance can be used, false otherwise.
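
The attribute check described above can be sketched with a plain java.util.Properties lookup. The module below is hypothetical: it assumes an implementation that can only provide a recoverable store, and it reuses the RawStore.recoverable attribute name from the example in the text. It is not how ExternalSortFactory itself decides.

```java
import java.util.Properties;

final class CanSupportSketch {
    /** Attribute name taken from the example in the description. */
    static final String RECOVERABLE = "RawStore.recoverable";

    /**
     * Hypothetical check for a module that is always recoverable:
     * it fits when no requirement is stated or when recoverable=true is requested.
     */
    static boolean canSupport(Properties startParams) {
        String wanted = startParams.getProperty(RECOVERABLE);
        return wanted == null || Boolean.parseBoolean(wanted);
    }

    public static void main(String[] args) {
        Properties temporaryStore = new Properties();
        temporaryStore.setProperty(RECOVERABLE, "false");
        System.out.println("no requirement: " + canSupport(new Properties()));
        System.out.println("temporary store requested: " + canSupport(temporaryStore));
    }
}
```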

boot

public void boot(boolean create,
                 java.util.Properties startParams)
          throws StandardException
Description copied from interface: ModuleControl
Boot this module with the given properties. Creates a module instance that can be found using the findModule() methods of Monitor. The module can only be found using one of these findModule() methods once this method has returned.

An implementation's boot method can throw StandardException. If it is thrown, the module is not registered by the monitor and therefore cannot be found through a findModule(). In this case the module's stop() method is not called, so the code that throws this exception must first free up any resources.

When create is true the contents of the properties object will be written to the service.properties of the persistent service. Thus any code that requires an entry in service.properties must explicitly place the value in this properties set using the put method.
Typically the properties object contains one or more default properties sets, which are not written out to service.properties. These default sets are how callers modify the create process. When a database is created through a JDBC connection, the first set of defaults is a properties object that contains the attributes that were set on the jdbc:derby: URL. This attributes properties set has the second default properties set as its default. This set (which could be null) contains the properties that the user set on their DriverManager.getConnection() call; these are not owned by Cloudscape code and thus must not be modified by it.

When create is false the properties object contains all the properties set in the service.properties file plus a limited number of attributes from the JDBC URL attributes or connection properties set. This avoids properties set by the user compromising the boot process. An example of a property passed in from the JDBC world is the bootPassword for encrypted databases.

Code should not hold onto the passed-in properties reference after boot time, as its contents may change underneath it. At the very least, once the boot is complete, the links to all the default sets will be removed.

Specified by:
boot in interface ModuleControl
Throws:
StandardException - Module cannot be started.
See Also:
Monitor, ModuleFactory
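
The distinction above between entries placed directly in the properties object and entries that are only visible through its default sets can be shown with java.util.Properties alone. This is a sketch under stated assumptions: the chain of defaults is built by hand, and the keys "user", "create" and "sketch.sortBufferMax" are hypothetical names chosen for the example, not properties read by ExternalSortFactory.

```java
import java.util.Properties;

final class BootPropertiesSketch {
    /** Build an empty start-parameter set backed by two chained default sets. */
    static Properties buildStartParams() {
        Properties userProps = new Properties();          // second default set
        userProps.setProperty("user", "app");
        Properties urlAttributes = new Properties(userProps); // first default set
        urlAttributes.setProperty("create", "true");
        return new Properties(urlAttributes);
    }

    /** Hypothetical boot step: copy a needed value into the set itself on create. */
    static void boot(boolean create, Properties startParams) {
        if (create) {
            // getProperty searches the default sets; put stores in this set only.
            String value = startParams.getProperty("sketch.sortBufferMax", "1024");
            startParams.put("sketch.sortBufferMax", value);
        }
    }

    public static void main(String[] args) {
        Properties startParams = buildStartParams();
        System.out.println("visible via defaults: " + startParams.getProperty("create"));
        System.out.println("own entry before boot: " + startParams.containsKey("create"));
        boot(true, startParams);
        System.out.println("own entries after boot: " + startParams.keySet());
    }
}
```

Because containsKey, keySet and size look only at the object's own table, a value that lives solely in a default set is readable through getProperty but is not one of the object's own entries, which is why the text asks for an explicit put.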

stop

public void stop()
Description copied from interface: ModuleControl
Stop the module. The module may be found via a findModule() method until some time after this method returns. Therefore the factory must be prepared to reject requests to it once it has been stopped. In addition, other modules may cache a reference to the module and make requests of it after it has been stopped; these requests should be rejected as well.

Specified by:
stop in interface ModuleControl
See Also:
Monitor, ModuleFactory
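
One way a factory could reject requests after stop(), as the description requires, is a simple stopped flag checked on entry. The class below is a hypothetical sketch: it uses IllegalStateException and a String result as stand-ins, whereas a real Derby module would report the failure through its own exception policy.

```java
final class StoppableFactorySketch {
    /** Volatile so that a stop() on one thread is seen by callers on others. */
    private volatile boolean stopped;

    void stop() {
        stopped = true;
    }

    /** Stand-in for a factory request; fails once the factory is stopped. */
    String createSort() {
        if (stopped) {
            throw new IllegalStateException("factory has been stopped");
        }
        return "sort";
    }

    public static void main(String[] args) {
        StoppableFactorySketch factory = new StoppableFactorySketch();
        System.out.println(factory.createSort());
        factory.stop();
        try {
            factory.createSort();
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```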


Apache Derby V10.0 Engine Documentation - Copyright © 1997,2004 The Apache Software Foundation or its licensors, as applicable.