Commit Graph

21 Commits

Author SHA1 Message Date
Zhe Wang
ae2ee24a40 A BulkLoad Job Should Use One Range Lock (#12232)
* bulkload job should use one range lock

* fix ctest

* update cli
2025-07-14 11:40:19 -07:00
Zhe Wang
9c1ae19021 speed up range lock clear for bulkload cancellation (#12227) 2025-07-10 12:00:00 -07:00
Zhe Wang
14bec9cefa Fix assertion failure in fdbcli (#12095) 2025-04-21 17:22:20 -07:00
Zhe Wang
ee0cd50c7c Reject Range Lock/Unlock Requests with Conflicting Range, User, or Lock Type (#12047)
* fix range lock

* make bulkload workload correct

* fix bugs and improve test coverage

* nits

* address comments

* nits

* address comments
2025-03-21 16:59:52 -07:00
Zhe Wang
a345d66ec3 A Couple of Fixes and Improvements for BulkLoad/Dump (#12040) 2025-03-19 09:00:36 -07:00
Zhe Wang
6ae46b4917 BulkLoadJob Should Not Schedule Completed BulkLoadTask (#12030)
* make bulkload job manager logic clear

* bypass task if the task has been completed

* improve scheduleBulkLoadJob
2025-03-14 14:52:33 -07:00
Zhe Wang
9f5fdd0bea Add BulkLoad Task Count to BulkLoad FDBCLI Command (#12029)
* change a event name

* add bulkload task count to fdbcli

* nit
2025-03-13 21:07:47 -07:00
Michael Stack
74f447cbd9 More cleanup of bulk* cli (#12015)
Tighten up options for bulk*. Compound 'local' and 'blobstore' as 'dump'/'load'. Ditto for 'history'.

Make it so 'bulkload mode' works like 'bulkdump mode': i.e. dumps current mode.

If mode is not on for bulk*, ERROR in same manner as for writemode.

Make it so we can return bulk* subcommand specific help rather than dump all help when there is an issue.

Make the commands match in the ctest
2025-03-13 13:49:53 -07:00
Zhe Wang
10fecd0a4e Add Error Message To BulkLoadJob Metadata (#12024)
* add error message to bulkload metadata

* remove TODOs and add error message for bulkload job manifest map creation failures

* nits
2025-03-13 10:02:39 -07:00
Michael Stack
6ee6e0bd7f Edit of bulkload/bulkdump cli. (#12012)
* fdbcli/BulkDumpCommand.actor.cpp
* fdbcli/BulkLoadCommand.actor.cpp
 Print out the bulkdump description rather than usage so user
 has a chance of figuring out what it is they entered incorrectly.
 Make bulkdump and bulkload align by using 'cancel' instead of
 'clear' in both and ordering the sub-commands the same for
 bulkload and bulkdump.  Add more help to the description.
 Bulkload was missing mention of the jobid needed
 when specifying a bulkload.
* documentation/sphinx/source/bulkdump.rst
 s/clearBulkDumpJob/cancelBulkDumpJob/

Co-authored-by: stack <stack@duboce.com>
2025-03-11 08:52:13 -07:00
Zhe Wang
8142ebd029 Add BulkLoad History (#11992)
* add bulkload history

* address comments

* address comments
2025-03-04 18:50:08 -08:00
Zhe Wang
8da2a54f4d Add BulkloadJob Cancellation (#11976)
* add bulkload cancellation

* reduce frequency of job cancellation in tests

* fix bulkload assert failure

* nits

* fix busy loop in bulkload/dump workload

* fix workload

* but

* address comments and CI failures

* add task count trace event
2025-02-27 20:34:53 +00:00
Zhe Wang
2116547ad3 Improve BulkDump Implementation (#11974)
* bulkdump code refactor

* fix bugs

* improve
2025-02-26 13:58:45 -08:00
Zhe Wang
5cce92dcac Simplify BulkLoad Job Metadata (#11959)
* address comments in the PR 11952

* code refactor and simplification

* avoid task outdated in DDBulkLoadJobExecute

* nit

* fix CI issue
2025-02-25 10:57:22 -08:00
michael stack
aea37ae90d Use s3 if available when running the bulkload test.
It was disabled until we made it so the SS could
talk to s3, included in this PR.

Also finished the bulkload test. It only had the
bulkdump portion. bulkload support was recently
added so finish off the test here by adding bulkload
of the bulkdump and then verifying all data present.

Added passing knobs to the fdb cluster so available to the
fdbserver when it goes to talk to s3. Also added passing
SS count to start in fdb cluster.

* fdbclient/tests/fdb_cluster_fixture.sh
 Add ability to pass multiple knobs to fdb cluster
 and to specify more than just one SS.

* fdbserver/fdbserver.actor.cpp
 Add --blob-server option and processing of FDB_BLOB_CREDENTIALS
 if present (hijacked the unused, unadvertised
 --blob-credentials-file).

* tests/loopback_cluster/run_custom_cluster.sh
 Allow passing more than just one knob.

* fdbclient/BulkLoading.cpp
* fdbclient/include/fdbclient/BulkLoading.h
 Added getPath

* fdbclient/S3BlobStore.actor.cpp
 Fix bug where we were doubling up the first '/' on a path if
 it had a root '/' already (s3 treats /a/b as distinct from
 /a//b).

* fdbclient/S3Client.actor.cpp
 Fix up of traceevent Types.

* fdbclient/tests/bulkload_test.sh
 Enable being able to use s3 if available.
 Pick up jobid when bulkdumping. Feed it to new bulkload
 method. Add verification all data present post-bulkload.

* fdbserver/BulkLoadUtil.actor.cpp
 Add support for blobstore.

* tests/loopback_cluster/run_custom_cluster.sh
 Bug fix -- we were only able to pass in one knob. Allow
 passing multiple.
2025-01-17 17:29:56 -08:00
Zhe Wang
9195f78bec Bulkload FDBCLI Command (#11886) 2025-01-15 09:27:59 -08:00
Zhe Wang
cf7c8f41b2 BulkLoad Job Framework and Co-Testing BulkLoad and BulkDump (#11865)
* add bulkload job framework and fix bugs

* add BulkLoadChecksum, fix CI issue

* nits

* nits

* address comments

* mitigate perpetual wiggle to make sure DD can select a valid team to inject data

* fix submitBulkDumpJob and submitBulkLoadJob

* change remoteRoot to jobRoot

* add comments
2025-01-14 11:28:42 -08:00
Zhe Wang
d3532e4478 Improve BulkLoad/Dump implementation (#11842)
* Improve BulkLoad/Dump implementation

* make bulkload test data folder inside simfdb folder

* simplify code

* use manifest in bulkdump metadata

* use manifest in bulkload

* apply bulkload fileset to bulkload and fix bugs of bytesampling value generation

* remove BulkDumpFileFullPathSet

* address comments

* address comments

* address comments
2025-01-06 13:02:23 -08:00
Zhe Wang
27253a5aca address comments 2024-12-09 15:04:31 -08:00
Sepeth
2a82f22fe5 Fix warnings for long long or int64_t format specifiers by switching to fmt::print* (#11574) 2024-09-12 12:10:40 -07:00
Zhe Wang
74990e44bd Bulk Loading Framework (#11369) 2024-07-23 14:57:28 -07:00