and allocated size
Queues.
closes #1677
does not return values is called
- Fixed the check for the function return value to handle the case when a created object is returned without assigning it to a local variable
closes #1687
generated code to allow scalar replacement for more cases
closes #1686
references inside if block
- Added a new optimizer rule which checks whether a query references directory columns only and has a DISTINCT or GROUP BY operation. If the condition holds, instead of scanning the full file set the following will be performed:
1) if there is cache metadata file, these directories will be read from it,
2) otherwise directories will be gathered from selection object (PartitionLocation).
In the end the Scan node will be transformed into a DrillValuesRel (containing constant literals) with the gathered values, so no scan will be performed.
closes #1640
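The two-step directory lookup above can be sketched as follows; the names (`DirPruneSketch`, `resolveDirectories`, the boolean flags) are illustrative stand-ins for the rule's actual checks, not Drill's API:

```java
import java.util.List;

// Hypothetical sketch of the rule's decision logic, not Drill's planner code.
class DirPruneSketch {
  static List<String> resolveDirectories(boolean onlyDirColumns,
                                         boolean hasDistinctOrGroupBy,
                                         List<String> cachedDirs,      // from the metadata cache file, may be null
                                         List<String> partitionDirs) { // from the PartitionLocation selection
    if (!onlyDirColumns || !hasDistinctOrGroupBy) {
      return null; // rule does not fire; keep the full scan
    }
    // Prefer the metadata cache file; fall back to the selection object.
    return cachedDirs != null ? cachedDirs : partitionDirs;
  }
}
```

The returned constant values are what the transformed DrillValuesRel would carry in place of the scan.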
closes #1666
1. Renamed map to struct in schema parser.
2. Updated the sqlTypeOf function to return STRUCT instead of MAP; the drillTypeOf function will return MAP as before until internal renaming is done.
3. Added an is_struct alias to the already existing is_map function. The function should be revisited once Drill supports true maps.
4. Updated unit tests.
closes #1688
closes #1685
1. Added format, default and column properties logic.
2. Changed schema JSON after serialization.
3. Added appropriate unit tests.
closes #1684
The result set loader allows controlling batch sizes. The new scan framework
built on top of that framework handles projection, implicit columns, null
columns and more. This commit converts the "new" ("compliant") text reader
to use the new framework. Options select the use of the V2 ("new") or V3
(row-set based) versions. Unit tests demonstrate V3 functionality.
closes #1683
closes #1652
closes #1665
port is used
closes #1656
closes #1667
closes #1674
- replaced all String path representations with org.apache.hadoop.fs.Path
- added PathSerDe.Se JSON serializer
- refactoring of DFSPartitionLocation code by leveraging existing listPartitionValues() functionality
closes #1657
Roll-up of fixes and enhancements that emerged from the effort to host the CSV reader on the new framework.
closes #1676
1. Updated protobuf to version 3.6.1
2. Added protobuf to the root pom dependency management
3. Added classes BoundedByteString and LiteralByteString for compatibility with HBase
4. Added ProtobufPatcher to provide compatibility with MapR-DB and HBase
closes #1639
'ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER' (#1663)
implemented
- Implemented the 'repeated_count' function for repeated MAP and repeated LIST;
- Updated RepeatedListReader and RepeatedMapReader implementations to return the correct value from the size() method;
- Moved repeated_count to a Freemarker template and added support for more repeated types for the function
closes #1641
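A minimal stand-in for the behavior described above (the real implementation reads vector metadata through RepeatedListReader/RepeatedMapReader and is generated from a Freemarker template; here a plain `List` plays the role of the repeated value):

```java
import java.util.List;

// repeated_count returns the number of elements in a row's repeated value.
class RepeatedCountSketch {
  static int repeatedCount(List<?> repeatedValue) {
    // Treat a missing repeated value as empty rather than failing.
    return repeatedValue == null ? 0 : repeatedValue.size();
  }
}
```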
- `NullExpression`s in `IfExpression`s with nested `IfExpression`s are now rewritten to typed expressions recursively when necessary
closes #1668
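The recursive rewrite can be illustrated with simplified stand-in expression types (these are not Drill's LogicalExpression classes): an untyped NULL branch adopts the type of its sibling branch, and nested IFs are rewritten first:

```java
// Simplified stand-ins for Drill's expression tree; not the real LogicalExpression API.
class TypedNullRewrite {
  interface Expr { String type(); }                  // null type() == untyped
  static final class NullExpr implements Expr {      // untyped NULL literal
    public String type() { return null; }
  }
  static final class TypedNullExpr implements Expr { // NULL carrying a concrete type
    final String t;
    TypedNullExpr(String t) { this.t = t; }
    public String type() { return t; }
  }
  static final class Lit implements Expr {           // any typed expression
    final String t;
    Lit(String t) { this.t = t; }
    public String type() { return t; }
  }
  static final class IfExpr implements Expr {        // IF with then/else branches
    Expr thenE, elseE;
    IfExpr(Expr thenE, Expr elseE) { this.thenE = thenE; this.elseE = elseE; }
    public String type() { return thenE.type() != null ? thenE.type() : elseE.type(); }
  }

  // Rewrite nested IFs first, then give any untyped NULL its sibling's type.
  static Expr rewrite(Expr e) {
    if (!(e instanceof IfExpr)) return e;
    IfExpr ife = (IfExpr) e;
    ife.thenE = rewrite(ife.thenE);
    ife.elseE = rewrite(ife.elseE);
    if (ife.thenE instanceof NullExpr && ife.elseE.type() != null) {
      ife.thenE = new TypedNullExpr(ife.elseE.type());
    }
    if (ife.elseE instanceof NullExpr && ife.thenE.type() != null) {
      ife.elseE = new TypedNullExpr(ife.thenE.type());
    }
    return ife;
  }
}
```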
Add support for avg row-width and major type statistics.
Parallelize the ANALYZE implementation and stats UDF implementation to improve stats collection performance.
Update/fix rowcount, selectivity and ndv computations to improve plan costing.
Add options for configuring collection/usage of statistics.
Add new APIs and implementation for stats writer (as a precursor to Drill Metastore APIs).
Fix several stats/costing related issues identified while running TPC-H and TPC-DS queries.
Add support for CPU sampling and nested scalar columns.
Add more testcases for collection and usage of statistics and fix remaining unit/functional test failures.
Thanks to Venki Korukanti (@vkorukanti) for the description below (modified to account for new changes). He graciously agreed to rebase the patch to the latest master, fixed a few issues and added a few tests.
FUNCS: Statistics functions as UDFs:
Separate
Currently using FieldReader to ensure consistent output type so that Unpivot doesn't get confused. All stats columns should be Nullable, so that stats functions can return NULL when N/A.
* custom versions of "count" that always return BigInt
* HyperLogLog based NDV that returns BigInt that works only on VarChars
* HyperLogLog with binary output that only works on VarChars
OPS: Updated protobufs for new ops
OPS: Implemented StatisticsMerge
OPS: Implemented StatisticsUnpivot
ANALYZE: AnalyzeTable functionality
* JavaCC syntax more-or-less copied from LucidDB.
* (Basic) AnalyzePrule: DrillAnalyzeRel -> UnpivotPrel StatsMergePrel FilterPrel(for sampling) StatsAggPrel ScanPrel
ANALYZE: Add getMetadataTable() to AbstractSchema
USAGE: Change field access in QueryWrapper
USAGE: Add getDrillTable() to DrillScanRelBase and ScanPrel
* since ScanPrel does not inherit from DrillScanRelBase, this requires adding a DrillTable to the constructor
* This is done so that a custom ReflectiveRelMetadataProvider can access the DrillTable associated with Logical/Physical scans.
USAGE: Attach DrillStatsTable to DrillTable.
* DrillStatsTable represents the data scanned from a corresponding ".stats.drill" table
* In order to avoid doing query execution right after the ".stats.drill" table is found, metadata is not actually collected until the MaterializationVisitor is used.
** Currently, the metadata source must be a string (so that a SQL query can be created). Doing this with a table is probably more complicated.
** Query is set up to extract only the most recent statistics results for each column.
closes #729
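The UDF shapes described under FUNCS can be sketched as below; a HashSet stands in for HyperLogLog purely to show the aggregate contract (nullable input, BigInt-style long output) — the real NDV uses HLL for a bounded-memory estimate:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the statistics aggregates' contract, not Drill's generated UDF code.
class StatsSketch {
  // Custom "count" variant: skips NULLs and always yields a BigInt (long).
  static long statCount(Iterable<String> column) {
    long n = 0;
    for (String v : column) {
      if (v != null) n++;
    }
    return n;
  }

  // NDV: the real implementation estimates this with HyperLogLog over VarChars.
  static long ndv(Iterable<String> column) {
    Set<String> distinct = new HashSet<>();
    for (String v : column) {
      if (v != null) distinct.add(v);
    }
    return distinct.size();
  }
}
```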
Adds the "plumbing" that connects the scan operator to the result set loader and the scan projection framework. See the various package-info.java files for the technical details. Also adds a large number of tests.
This PR does not yet introduce an actual scan operator: that will follow in subsequent PRs.
closes #1618
closes #1644
This PR standardizes error and alert messages to a cleaner interface by leveraging Bootstrap's UX elements for publishing the messages in a presentable format.
Exceptions are reported back to the browser and rendered in a neat tabular format (using Panels).
All errors can be redirected to errorMessage.ftl, which will render them in a neat format.
Alerts are replaced with modals.
Interactions (pages) affected by alert modals:
1. Missing query submission
2. Profile query rerun
3. Invalid profile listing fetch
4. Invalid option value for update
5. Missing username/password submission
The errorMessage.ftl has been moved to the root dir, and the unused `error.ftl` was removed
metadata auto-refresh
closes #1638
DRILL-7006 added a type conversion "shim" within the row set framework. Basically, we insert a "shim" column writer that takes data in one form (String, say), and does reader-specific conversions to a target format (INT, say).
The code works fine, but the shim class ends up needing to override a bunch of methods which it then passes along to the base writer. This PR refactors the code so that the conversion shim is simpler.
closes #1633
1. Moved the Calcite dependency from the hadoop-default profile to general dependency management
2. Updated Calcite version to 1.18.0-drill-r0 and Avatica version to 1.13.0
3. Hook.REL_BUILDER_SIMPLIFY moved to a static block, because now it can't be removed (fixes DRILL-6830)
4. Removed WrappedAccessor, since it was a workaround for an issue fixed in CALCITE-1408
5. Fixed setting of multiple options in TestBuilder
6. Timestampadd type inference aligned with CALCITE-2699
7. The dependency update caused a 417 kB increase of the jdbc-all jar size, so the maxsize limit was
increased from 39.5 to 40 MB
8. Added test into TestDrillParquetReader to ensure that DRILL-6856 was
fixed by Calcite update
close apache/drill#1631
and filter condition is swapped
close apache/drill#1628
close apache/drill#1629
Note: this PR only adds support for the CREATE / DROP SCHEMA commands, which allow storing and deleting a schema. Schema usage during querying the data will be covered in other PRs.
1. Added parser methods / handles to parse CREATE / DROP schema commands.
2. Added SchemaProviders classes to separate ways of schema provision (file, table function).
3. Added schema parsing using ANTLR4 (lexer, parser, visitors).
4. Added appropriate unit tests.
close apache/drill#1615
close apache/drill#1630
ShutdownThread is no longer required when Drillbit#close() is called.
Running mvn install for the Drill project consumed 600 MiB (there were 160 shutdown hooks)
close apache/drill#1625
close apache/drill#1622
close apache/drill#1620
Modifies the column metadata and writer abstractions to allow a type conversion "shim" to be specified as part of the schema, then inserted as part of the row set writer. Allows, say, setting an Int or Date from a string, parsing the string to obtain the proper data type to store in the vector.
Type conversion not yet supported in the result set loader: some additional complexity needs to be resolved.
Adds unit tests for this functionality. Refactors some existing tests to remove rough edges.
closes #1623
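The shim idea can be sketched with plain Java (hypothetical `IntWriter` / `StringToIntShim` names; Drill's real ScalarWriter API differs): the shim accepts the source representation, converts it, and delegates to the typed writer that fills the vector:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of the conversion "shim"; names are illustrative, not Drill's API.
class ConversionShimSketch {
  interface IntWriter { void setInt(int v); }

  // Base writer: appends ints to a column (stand-in for writing a value vector).
  static class IntColumnWriter implements IntWriter {
    final List<Integer> column = new ArrayList<>();
    public void setInt(int v) { column.add(v); }
  }

  // Shim inserted in front of the base writer: parses the String, forwards the int.
  static class StringToIntShim {
    final IntWriter base;
    StringToIntShim(IntWriter base) { this.base = base; }
    void setString(String s) { base.setInt(Integer.parseInt(s)); }
  }
}
```

The same pattern works for other source/target pairs, e.g. parsing a date string before storing a Date value.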
not complete
closes #1621
coalesce exist in a parquet file
- Updated UntypedNullVector to hold the value count when the vector is allocated and transferred to another one;
- Updated RecordBatchLoader and DrillCursor to handle the case when only UntypedNull values are present in the RecordBatch (a special case when the data buffer is null but actual values are present);
- Added functions to cast UntypedNull value to other types for use in UDFs;
- Moved UntypedReader, UntypedHolderReaderImpl and UntypedReaderImpl from org.apache.drill.exec.vector.complex.impl to org.apache.drill.exec.vector package.
closes #1614
closes #1600
closes #1619
instead of ValueHolder
closes #1617
This provides an option to order the list of query profiles based on any of the displayed fields, including total duration. This way, a user can easily identify long-running queries.
In addition, the number of profiles listed per page, for both the completed and running query lists, has been made configurable with the parameter `drill.exec.http.profiles_per_page` (default is 10,25,50,100)
closes #1594
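The page-size option above lives in Drill's HOCON configuration; a sketch of overriding it in drill-override.conf (the option name comes from this commit; the list syntax and values reflect the stated defaults and may differ in the actual config file):

```
drill.exec.http: {
  profiles_per_page: [10, 25, 50, 100]
}
```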
binary table
1. Added persistence of MAP key and value types in Drill views (affects the .view.drill file) to avoid cast problems in the future.
2. Preserved backward compatibility of older view files by treating untyped maps as ANY.
closes #1602
plugin when native reader is enabled
closes #1610
1. ColumnBuilder: setPrecisionAndScale method
2. SchemaContainer: addColumn method parameter AbstractColumnMetadata was changed to ColumnMetadata
3. MapBuilder / RepeatedListBuilder / UnionBuilder: added constructors without parent, made buildColumn method public
4. TupleMetadata: added toMetadataList method
5. Other refactoring