Age | Commit message | Author |
|
RootExec root wasn't initialized
closes #1506
|
|
execution
closes #1455
|
|
|
|
|
|
when terminating the JVM.
closes #1306
|
|
connection instead of submitting locally
closes #1253
|
|
closes #1207
|
|
close apache/drill#1105
|
|
a downstream causing query to hang
closes #1151
|
|
closes #1045
|
|
FragmentSetupException for completed/cancelled fragments
This closes #1041
|
|
Note: Resolved Merge Conflict and added certain new tests
closes #919
|
|
when Drillbit to Drillbit Control Connection has network issues
Note: 1) To resolve the issue, all fragments (including the root fragment) that are assigned to execute on the Foreman node
are scheduled locally and not sent over the Control Tunnel. Also, the FragmentStatusReporter is updated to send
status updates locally for fragments running on the Foreman node.
2) Refactor for FragmentManager, setupRootFragment and startNewFragment
3) Update the test added for DRILL-5701 as there is change in behavior
|
|
status updates to Foreman
closes #934
|
|
closes #926
|
|
|
|
+ Support authentication in ControlServer and ControlClient
+ Add AuthenticationCommand as an initial command after handshake
and before the command that initiates a connection
+ Add ControlConnectionConfig to encapsulate configuration
+ ControlMessageHandler now implements RequestHandler
control
|
|
+ Extract RemoteConnection interface, and add AbstractRemoteConnection
+ Add ServerConnection and ClientConnection interfaces
+ Add RequestHandler interface to decouple connections from how requests are handled
+ Add NonTransientRpcException
+ Remove unused classes and methods
+ Code style changes
|
|
rpc threads
closes #561
|
|
- Replace Stopwatch constructors with .createStarted() or .createUnstarted()
- Stop using InputSupplier and Closeables.closeQuietly
- Clean up quiet closes to log or (preferably) propagate.
- Add log4j to enforcer exclusions.
- Update HBaseTestSuite to add patching of Closeables.closeQuietly() and Stopwatch legacy methods. Only needed when running HBaseMiniCluster.
- Remove log4j from HBase's pom to provide exception logging.
- Remove log4j from Hive's shaded pom.
- Update Catastrophic failures to use the same pattern to ensure reporting.
- Update test framework to avoid trying IPv6 resolution. (This removes 90s pause from HBase startup in my tests)
This closes #361.
This closes #157.
|
|
- make Allocator mostly lockless
- change BaseAllocator maps to direct references
- add documentation around memory management model
- move transfer and ownership methods to DrillBuf
- Improve debug messaging.
- Fix/revert sort changes
- Remove unused fragment limit flag
- Add time to HistoricalLog events
- Remove reservation amount from RootAllocator constructor (since not allowed)
- Fix concurrency issue where allocator is closing at same moment as incoming batch transfer, causing leaked memory and/or query failure.
- Add new AutoCloseables.close(Iterable<AutoCloseable>)
- Remove extraneous DataResponseHandler and Impl (and update TestBitRpc to use smarter mock of FragmentManager)
- Remove the concept of poison pill record batches, using instead FragmentContext.isOverMemoryLimit()
- Update incoming data batches so that they are transferred under protection of a close lock
- Improve field names in IncomingBuffers and move synchronization to collectors as opposed to IncomingBuffers (also change decrementing to decrementToZero rather than two part check).
This closes #238.
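The "close everything, keep every failure" idea behind AutoCloseables.close(Iterable<AutoCloseable>) can be sketched roughly as follows. This is only an illustration: CloseAllSketch and closeAll are hypothetical names, and the actual Drill helper may behave differently.

```java
/** Hypothetical helper for illustration; not Drill's actual AutoCloseables class. */
final class CloseAllSketch {
  private CloseAllSketch() {}

  /**
   * Closes every resource even if some fail. The first failure is rethrown
   * after the loop, with any later failures attached to it as suppressed.
   */
  static void closeAll(Iterable<? extends AutoCloseable> resources) throws Exception {
    Exception first = null;
    for (AutoCloseable resource : resources) {
      if (resource == null) {
        continue; // tolerate missing entries
      }
      try {
        resource.close();
      } catch (Exception e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

Continuing past a failed close means one bad resource does not prevent the remaining buffers from being released.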
|
|
- Extract Accountor interface from Implementation
- Separate FMPP modules to separate out Vector Needs versus external needs
- Separate out Vector classes from those that are VectorAccessible.
- Cleanup Memory Exception hierarchy
|
|
secondary thread.
- Create a separate serialized executor for fragment receiverFinished events.
- Update serialized executor to pool object creation.
- Ensure that FragmentExecutor acceptExternalEvents countdown occurs when the only execution is a cancellation.
|
|
- Formatting
- @Overrides
- finals
- some AutoCloseable additions
- new isCancelled() abstract method on FragmentManager, implemented on subclasses
Added missing new abstract method isCancelled()
Close apache/drill#120
|
|
NonRootStatusReporter to FragmentStatusReporter
+ Removed StatusReporter interface
+ Refactored FragmentStatusReporter
|
|
|
|
- each time a fragment A sends a "receiver finished" to fragment B, fragment B id will be
added to FragmentContext.ignoredSenders list
- refactored UnorderedReceiverBatch.informSenders() and MergingRecordBatch.informSenders()
by moving this method to FragmentContext
- DataServer.send() uses FragmentContext.ignoredSenders to decide if a batch should be
passed to the fragment or discarded right away
- BaseRawBatchBuffer methods enqueue() and kill() are now synchronized
- TestTpcdsSf1Leak test reproduces the leak, it's ignored by default because it requires
a large dataset
|
|
+ Added RepeatTestRule to tests that are flaky
+ Added Controls.Builder to create controls string in tests
+ Added @Ignore to failing tests (filed JIRAs)
Other fixes:
+ Added @Override to ScanBatch#close to avoid potential bugs
+ Added docs link in ProtobufLengthDecoder
+ Fixed logging issue in CountDownLatchImpl
|
|
to log from the correct class
+ Fixed docs in UserException
+ Created loggers, and changed logger visibility to private
|
|
+ DRILL-2867: Add ControlsValidator to VALIDATORS only if assertions are enabled
+ return in ExecutionControls ctor if assertions are not enabled
+ added InjectorFactory class to align with the logger pattern
|
|
cancellation.
|
|
Changed the cleanup handling at the end of ImplCreator.getExec(), and
handle the newly returned null value in FragmentExecutor.run().
|
|
fragment start.
|
|
|
|
|
|
cancellation changes
Execution: In WorkManager,
+ swap implementations of startFragmentPendingRemote() and addFragmentRunner()
+ warn if there are running fragments in close()
Cancellation:
+ for fragments waiting on data, delegate cancellations to WorkEventBus (in Foreman and ControlMessageHandler)
+ documentation
|
|
- Interrupt FragmentExecutor thread as part of FragmentExecutor.cancel()
- Handle InterruptedException in ExternalSortBatch.newSV2(). If the fragment status says it
should not continue, then throw the InterruptedException to the caller, which returns IterOutcome.STOP
- Add comments regarding the unhandled InterruptedException in SendingAccountor.waitForSendComplete()
- Handle InterruptedException in OrderedPartitionRecordBatch.getPartitionVectors()
If interrupted in Thread.sleep calls and the fragment status says it should not run, then
return IterOutcome.STOP downstream.
- Interrupt partitioner threads if PartitionerRecordBatch is interrupted while waiting for
partitioner threads to complete.
- Preserve interrupt status if not handled
- Handle null RecordBatches returned by RawBatchBuffer.getNext() in MergingRecordBatch.buildSchema()
- Change timeout in Foreman to be proportional to the number of intermediate fragments sent instead
of hard coded limit of 90s.
- Change TimedRunnable to enforce a timeout of 15s per runnable.
Total timeout is (5s * numOfRunnableTasks) / parallelism.
- Add unit tests
* Testing cancelling a query interrupts the query fragments which are currently blocked
* Testing interrupting the partitioner sender which in turn interrupts its helper threads
* Testing TimedRunnable enforces timeout for the whole task list.
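The interrupt-handling pattern in this commit (stop on interrupt, but preserve the interrupt status for callers) might look roughly like the sketch below. InterruptSketch, Outcome and awaitData are hypothetical stand-ins; the real operators return IterOutcome.STOP and consult the fragment status before deciding.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical names throughout; this is not one of Drill's operator classes. */
final class InterruptSketch {
  /** Stand-in for the OK / STOP outcomes an operator would report. */
  enum Outcome { OK, STOP }

  private InterruptSketch() {}

  /** Blocks until data is ready; on interrupt, restores the flag and reports STOP. */
  static Outcome awaitData(CountDownLatch dataReady) {
    try {
      dataReady.await();
      return Outcome.OK;
    } catch (InterruptedException e) {
      // await() cleared the interrupt status when it threw; set it again so
      // code further up the stack can still observe the cancellation.
      Thread.currentThread().interrupt();
      return Outcome.STOP;
    }
  }
}
```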
|
|
sends a resume signal to UserServer. UserServer triggers a resume call in the correct Foreman. Foreman resumes all pauses related to the query through the Control layer.
+ Better error messages and more tests in TestDrillbitResilience and TestPauseInjection
+ Added execution controls to operator context
+ Removed ControlMessageHandler interface, renamed ControlHandlerImpl to ControlMessageHandler
+ Added CountDownLatchInjection, useful in cases like PartitionedSender that spawns multiple threads
|
|
exception happens in the Foreman before the fragment executor starts running
|
|
cancellations
includes:
DRILL-2816: system error does not display the original Exception message
DRILL-2893: ScanBatch throws a NullPointerException instead of returning OUT_OF_MEMORY
DRILL-2894: FixedValueVectors shouldn't set its data buffer to null when it fails to allocate it
DRILL-2895: AbstractRecordBatch.buildSchema() should properly handle OUT_OF_MEMORY outcome
DRILL-2905: RootExec implementations should properly handle IterOutcome.OUT_OF_MEMORY
DRILL-2920: properly handle OutOfMemoryException
DRILL-2947: AllocationHelper.allocateNew() doesn't have a consistent behavior when it can't allocate
also:
- added UserException.memoryError() with a pre assigned error message
- injection site in ScanBatch and unit test that runs various tpch queries and injects
an exception in the ScanBatch that will cause an OUT_OF_MEMORY outcome to be sent
|
|
- Update Large Buffer allocation so Drill releases immediately rather than waiting for Garbage Collection
- Remove DrillBuf.wrap() and all references to it.
- Update Parquet Reader to reduce object churn and indirection.
- Add additional metric to memory iterator
- Add Large and small buffer metric histogram tracking
- Add memory tracking reporter
- Update Netty to 4.0.27
|
|
- Remove cleanup method from RecordBatch interface
- Make OperatorContext creation and closing the management of FragmentContext
- Make OperatorContext an abstract class and the impl only available to FragmentContext
- Make RecordBatch closing the responsibility of the RootExec
- Make all closes be suppressing closes to maximize memory release in failure
- Add new CloseableRecordBatch interface used by RootExec
- Make RootExec AutoCloseable
- Update RecordBatchCreator to return CloseableRecordBatches so that RootExec can maintain list
- Generate list of operators through change in ImplCreator
|
|
report instead of terminating thread.
|
|
|
|
across schemas
|
|
Drill
+ Controls are fired only if assertions are enabled
+ Controls can be introduced in any class that has access to FragmentContext/QueryContext
+ Controls can be fired by altering the DRILLBIT_CONTROL_INJECTIONS session option
+ Renames: SimulatedExceptions => ExecutionControls, ExceptionInjector => ExecutionControlsInjector
+ Added injection sites in Foreman, DrillSqlWorker, FragmentExecutor
+ Unit tests in TestDrillbitResilience, TestExceptionInjection and TestPauseInjection
Other commits included:
+ DRILL-2437: Moved ExecutionControls from DrillbitContext to FragmentContext/QueryContext
+ DRILL-2382: Added address and port to Injection to specify drillbit
+ DRILL-2384: Added QueryState to SingleRowListener and assert that state is COMPLETED while testing
Other edits:
+ Support for short lived session options in SessionOptionManager (using TTL in OptionValidator)
+ Introduced query count in UserSession
+ Added QueryState to queryCompleted() in UserResultsListener to check if COMPLETED/CANCELED
+ Added JSONStringValidator to TypeValidators
+ Log query id as string in DrillClient, WorkEventBus, QueryResultHandler
+ Use try..catch block only around else clause for OptionList in FragmentContext
+ Fixed drillbitContext spelling error in QueryContext
+ Fixed state transition when cancel() before run() in FragmentExecutor
+ Do not call setLocalOption twice in FallbackOptionManager
+ Show explicitly that submitWork() returns queryId in UserServer
+ Updated protocol/readme.txt to include an alternative way to generate sources
|
|
DeferredException
- Add new throwAndClear operation to allow checking for exceptions preClose in FragmentContext
- Add new getAndClear operation
BufferManager
- Ensure close() can be called multiple times by clearing managed buffer list on close().
FragmentContext/FragmentExecutor
- Update FragmentContext to have a preClose so that we can check closure state before doing final close.
- Update so that there is only a single state maintained between FragmentContext and FragmentExecutor
- Clean up FragmentExecutor run() method to better manage error states and have only single terminal point (avoiding multiple messages to Foreman).
- Add new CANCELLATION_REQUESTED state for FragmentState.
- Move all users of isCancelled or isFailed in main code to use shouldContinue()
- Update receivingFragmentFinished message to not cancel fragment (only inform root operator of cancellation)
WorkManager Updates
- Add new afterExecute command to the WorkManager ExecutorService so that we get log entries if a thread leaks an exception. (Otherwise logs don't show these exceptions and they only go to standard out.)
Profile Page
- Update profile page to show last update and last progress.
- Change durations to non-time presentation
Foreman/QueryManager
- Extract listenable interfaces into anonymous inner classes from body of Foreman
QueryManager
- Update QueryManager to track completed nodes rather than completed fragments using NodeTracker
- Update DrillbitStatusListener to decrement expected completion messages on Nodes that have died to avoid query hang when a node dies
FragmentData/MinorFragmentProfile
- Add ability to track last status update as well as last time fragment made progress
AbstractRecordBatch
- Update awareness of current cancellation state to avoid cancellation delays
Misc. Other changes
- Move ByteCode optimization code to only record assembly and code as trace messages
- Update SimpleRootExec to create fake ExecutorState to make existing tests work.
- Update sort to exit prematurely in the case that the fragment was asked to cancel.
- Add finals to all edited files.
- Modify control handler and FragmentManager to directly support receivingFragmentFinished
- Update receiver propagation message to avoid premature removal of fragment manager
- Update UserException.Builder to log a message if we're creating a new UserException (ERROR for System, INFO otherwise).
- Update Profile pages to use min and max instead of sorts.
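The getAndClear and throwAndClear operations listed under DeferredException in this commit can be pictured with a small holder like the one below. It is a sketch under assumptions: DeferredErrorSketch and its add method are hypothetical, and Drill's DeferredException may differ in detail.

```java
/** Hypothetical holder for illustration; not Drill's DeferredException class. */
final class DeferredErrorSketch {
  private Exception deferred;

  /** Records a failure; later failures are attached to the first as suppressed. */
  synchronized void add(Exception e) {
    if (deferred == null) {
      deferred = e;
    } else {
      deferred.addSuppressed(e);
    }
  }

  /** Returns the recorded failure (or null) and resets the holder. */
  synchronized Exception getAndClear() {
    Exception e = deferred;
    deferred = null;
    return e;
  }

  /** Throws the recorded failure, if any, and resets the holder either way. */
  void throwAndClear() throws Exception {
    Exception e = getAndClear();
    if (e != null) {
      throw e;
    }
  }
}
```

Clearing on read lets a pre-close check surface errors once without the final close reporting them a second time.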
|
|
errors are reported to the user
Added missing changes from committed patch
|
|
reported to the user
|
|
regardless of current state
FragmentExecutor:
- Changed cancel() to behave asynchronously, and for the cancellation request to
be checked at an appropriate place in the run() loop.
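An asynchronous cancel() of this kind is commonly built on a flag that the run() loop polls. The sketch below illustrates that general technique; CancellableWorkerSketch and its members are hypothetical and are not taken from FragmentExecutor.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical worker for illustration; not Drill's FragmentExecutor. */
final class CancellableWorkerSketch implements Runnable {
  private final AtomicBoolean cancelRequested = new AtomicBoolean(false);
  private final int totalBatches;
  private volatile int processed;

  CancellableWorkerSketch(int totalBatches) {
    this.totalBatches = totalBatches;
  }

  /** Returns immediately; run() notices the request at its next check. */
  void cancel() {
    cancelRequested.set(true);
  }

  /** Number of batches handled so far. */
  int processed() {
    return processed;
  }

  @Override
  public void run() {
    while (processed < totalBatches) {
      // Check for cancellation between units of work, regardless of whether
      // the request arrived before or after run() started.
      if (cancelRequested.get()) {
        return;
      }
      processed++; // stand-in for processing one batch; only this thread writes
    }
  }
}
```

Because cancel() only sets a flag, it is safe to call in any state, including before run() has started.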
|