Overview¶
b2sdk is a client library for easy access to all of the capabilities of B2 Cloud Storage.
The B2 command-line tool is an example of how it can be used to provide command-line access to the B2 service, but there are many other possible applications (including FUSE filesystems, storage backend drivers for backup applications, etc.).
Why use b2sdk?¶
When building an application which uses B2 cloud, it is possible to implement an independent B2 API client, but using b2sdk gives you:
reuse of code that is already written, with hundreds of unit tests
Synchronizer, a high-performance, parallel rsync-like utility
a developer-friendly library API version policy which guards your program against incompatible changes
automatic compliance with the B2 integration checklist
raw_simulator, which makes it easy to mock the B2 cloud for unit testing purposes
progress reporting of operations to an object of your choice
an exception hierarchy that makes it easy to display informative messages to users
automatic continuation of interrupted transfers
b2sdk was developed for three years before version 1.0.0 was released. It's stable and mature.
Documentation index¶
Installation Guide¶
Installing as a dependency¶
b2sdk can simply be added to requirements.txt (or an equivalent such as setup.py, Pipfile, etc.).
In order to properly set a dependency, see the versioning chapter for details.
Note
The stability of your application depends on correct pinning of versions.
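For example, to accept backwards-compatible 1.x releases only (the reasoning is explained in the Semantic versioning section), requirements.txt could contain:
b2sdk>=1.0.0,<2.0.0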
Installing a development version¶
To install b2sdk, check out the repository and run:
pip install b2sdk
in your Python environment.
Note
If you see a message saying that the six library cannot be installed, which happens if you're installing with the system Python on OS X El Capitan, try this:
pip install --ignore-installed b2sdk
Installing for contributors¶
You'll need some Python packages installed. To get all the latest things:
pip install --upgrade --upgrade-strategy eager -r requirements.txt -r requirements-test.txt -r requirements-setup.txt
There is a Makefile with a rule to run the unit tests using the currently active Python:
make setup
make test
will install the required packages, then run the unit tests.
Tutorial¶
AccountInfo¶
The AccountInfo object holds information about access keys, tokens, upload URLs, and a bucket id-name map.
It is the first object that you need to create to use b2sdk. Using AccountInfo, we'll be able to create a B2Api object to manage a B2 account.
In this tutorial we will use b2sdk.v1.InMemoryAccountInfo:
>>> from b2sdk.v1 import InMemoryAccountInfo
>>> info = InMemoryAccountInfo() # store credentials, tokens and cache in memory
With the info object in hand, we can now proceed to create a B2Api object.
Note
The AccountInfo section provides guidance for choosing the correct AccountInfo class for your application.
Account authorization¶
>>> from b2sdk.v1 import B2Api
>>> b2_api = B2Api(info)  # creation of B2Api is described in the next section
>>> application_key_id = '4a5b6c7d8e9f'
>>> application_key = '001b8e23c26ff6efb941e237deb182b9599a84bef7'
>>> b2_api.authorize_account("production", application_key_id, application_key)
Tip
Get credentials from B2 website
To find out more about account authorization, see b2sdk.v1.B2Api.authorize_account().
B2Api¶
B2Api allows for account-level operations on a B2 account.
Typical B2Api operations¶
Method | Description
---|---
authorize_account | Perform account authorization.
create_bucket | Create a bucket.
delete_bucket | Delete a chosen bucket.
list_buckets | Call b2_list_buckets and return a list of buckets.
get_bucket_by_name | Return the Bucket matching the given bucket_name.
create_key | Create a new application key.
list_keys | List application keys.
delete_key | Delete application key.
download_file_by_id | Download a file with the given ID.
list_parts | Generator that yields a b2sdk.v1.Part for each of the parts that have been uploaded.
cancel_large_file | Cancel a large file upload.
>>> b2_api = B2Api(info)
To find out more, see b2sdk.v1.B2Api.
The most practical operation on a B2Api object is b2sdk.v1.B2Api.get_bucket_by_name().
Bucket allows for operations such as listing a remote bucket or transferring files.
Bucket¶
Initializing a Bucket¶
Retrieve an existing Bucket¶
To get a Bucket object for an existing B2 bucket:
>>> b2_api.get_bucket_by_name("example-mybucket-b2-1")
Bucket<346501784642eb3e60980d10,example-mybucket-b2-1,allPublic>
Create a new Bucket¶
To create a bucket:
>>> bucket_name = 'example-mybucket-b2-1'
>>> bucket_type = 'allPublic' # or 'allPrivate'
>>> b2_api.create_bucket(bucket_name, bucket_type)
Bucket<346501784642eb3e60980d10,example-mybucket-b2-1,allPublic>
You can optionally store bucket info, CORS rules and lifecycle rules with the bucket. See b2sdk.v1.B2Api.create_bucket()
for more details.
Note
Bucket name must be unique in B2 (across all accounts!). Your application should be able to cope with a bucket name collision with another B2 user.
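As an illustrative sketch of storing extra settings with a bucket (the bucket_info keys are arbitrary application-defined metadata; the lifecycle rule shown follows the B2 lifecycle rule format):
>>> b2_api.create_bucket(
        'example-mybucket-b2-2',
        'allPrivate',
        bucket_info={'purpose': 'examples'},  # arbitrary key-value metadata
        lifecycle_rules=[{
            'daysFromHidingToDeleting': 7,
            'daysFromUploadingToHiding': None,
            'fileNamePrefix': 'logs/',
        }],
    )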
Typical Bucket operations¶
Method | Description
---|---
download_file_by_name | Download a file by name.
upload_local_file | Upload a file on local disk to a B2 file.
upload_bytes | Upload bytes in memory to a B2 file.
ls | Pretend that folders exist and yield the information about the files in a folder.
hide_file | Hide a file.
delete_file_version | Delete a file version.
get_download_authorization | Return an authorization token that is valid only for downloading files from the given bucket.
get_download_url | Get file download URL.
update | Update various bucket parameters.
set_type | Update bucket type.
set_info | Update bucket info.
To find out more, see b2sdk.v1.Bucket.
Summary¶
You now know how to use AccountInfo, B2Api and Bucket objects.
To see examples of some of the methods presented above, visit the quick start guide section.
Quick Start Guide¶
Prepare b2sdk¶
>>> from b2sdk.v1 import *
>>> info = InMemoryAccountInfo()
>>> b2_api = B2Api(info)
>>> application_key_id = '4a5b6c7d8e9f'
>>> application_key = '001b8e23c26ff6efb941e237deb182b9599a84bef7'
>>> b2_api.authorize_account("production", application_key_id, application_key)
Tip
Get credentials from B2 website
Synchronization¶
>>> from b2sdk.v1 import ScanPoliciesManager
>>> from b2sdk.v1 import parse_sync_folder
>>> from b2sdk.v1 import Synchronizer
>>> import time
>>> import sys
>>> source = '/home/user1/b2_example'
>>> destination = 'b2://example-mybucket-b2'
>>> source = parse_sync_folder(source, b2_api)
>>> destination = parse_sync_folder(destination, b2_api)
>>> policies_manager = ScanPoliciesManager(exclude_all_symlinks=True)
>>> synchronizer = Synchronizer(
max_workers=10,
policies_manager=policies_manager,
dry_run=False,
allow_empty_source=True,
)
>>> no_progress = False
>>> with SyncReport(sys.stdout, no_progress) as reporter:
synchronizer.sync_folders(
source_folder=source,
dest_folder=destination,
now_millis=int(round(time.time() * 1000)),
reporter=reporter,
)
upload some.pdf
upload som2.pdf
Tip
Sync is the preferred way of getting data into and out of B2 cloud, because it can achieve the highest performance thanks to parallelization of scanning and data transfer operations.
To learn more about sync, see Synchronizer.
Bucket actions¶
List buckets¶
>>> b2_api.list_buckets()
[Bucket<346501784642eb3e60980d10,example-mybucket-b2-1,allPublic>]
>>> for b in b2_api.list_buckets():
...     print('%s %-10s %s' % (b.id_, b.type_, b.name))
346501784642eb3e60980d10 allPublic example-mybucket-b2-1
Create a bucket¶
>>> bucket_name = 'example-mybucket-b2-1' # must be unique in B2 (across all accounts!)
>>> bucket_type = 'allPublic' # or 'allPrivate'
>>> b2_api.create_bucket(bucket_name, bucket_type)
Bucket<346501784642eb3e60980d10,example-mybucket-b2-1,allPublic>
You can optionally store bucket info, CORS rules and lifecycle rules with the bucket. See b2sdk.v1.B2Api.create_bucket().
Delete a bucket¶
>>> bucket_name = 'example-mybucket-b2-to-delete'
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> b2_api.delete_bucket(bucket)
b2sdk.v1.B2Api.delete_bucket() returns None if successful and raises an exception in case of error.
Update bucket info¶
>>> new_bucket_type = 'allPrivate'
>>> bucket_name = 'example-mybucket-b2'
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> bucket.update(bucket_type=new_bucket_type)
{'accountId': '451862be08d0',
'bucketId': '5485a1682662eb3e60980d10',
'bucketInfo': {},
'bucketName': 'example-mybucket-b2',
'bucketType': 'allPrivate',
'corsRules': [],
'lifecycleRules': [],
'revision': 3}
For more information, see b2sdk.v1.Bucket.update().
File actions¶
Tip
Sync is the preferred way of getting files into and out of B2 cloud, because it can achieve the highest performance thanks to parallelization of scanning and data transfer operations.
To learn more about sync, see Sync.
Use the functions described below only if you really need to transfer a single file.
Upload file¶
>>> local_file_path = '/home/user1/b2_example/new.pdf'
>>> b2_file_name = 'dummy_new.pdf'
>>> file_info = {'how': 'good-file'}
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> bucket.upload_local_file(
local_file=local_file_path,
file_name=b2_file_name,
file_infos=file_info,
)
<b2sdk.file_version.FileVersionInfo at 0x7fc8cd560550>
This will work regardless of the size of the file: upload_local_file automatically uses the large file upload API when necessary.
For more information, see b2sdk.v1.Bucket.upload_local_file().
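If the data is already in memory, it does not have to be written to disk first; upload_bytes (listed in the Bucket operations table above) uploads it directly. A minimal sketch:
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> bucket.upload_bytes(b'some data', 'dummy_bytes.txt', file_infos={'how': 'good-file'})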
Download file¶
By id¶
>>> from b2sdk.v1 import DownloadDestLocalFile
>>> from b2sdk.v1 import DoNothingProgressListener
>>> local_file_path = '/home/user1/b2_example/new2.pdf'
>>> file_id = '4_z5485a1682662eb3e60980d10_f1195145f42952533_d20190403_m130258_c002_v0001111_t0002'
>>> download_dest = DownloadDestLocalFile(local_file_path)
>>> progress_listener = DoNothingProgressListener()
>>> b2_api.download_file_by_id(file_id, download_dest, progress_listener)
{'fileId': '4_z5485a1682662eb3e60980d10_f1195145f42952533_d20190403_m130258_c002_v0001111_t0002',
'fileName': 'som2.pdf',
'contentType': 'application/pdf',
'contentLength': 1870579,
'contentSha1': 'd821849a70922e87c2b0786c0be7266b89d87df0',
'fileInfo': {'src_last_modified_millis': '1550988084299'}}
>>> print('File name: ', download_dest.file_name)
File name: som2.pdf
>>> print('File id: ', download_dest.file_id)
File id: 4_z5485a1682662eb3e60980d10_f1195145f42952533_d20190403_m130258_c002_v0001111_t0002
>>> print('File size: ', download_dest.content_length)
File size: 1870579
>>> print('Content type:', download_dest.content_type)
Content type: application/pdf
>>> print('Content sha1:', download_dest.content_sha1)
Content sha1: d821849a70922e87c2b0786c0be7266b89d87df0
By name¶
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> b2_file_name = 'dummy_new.pdf'
>>> local_file_name = '/home/user1/b2_example/new3.pdf'
>>> download_dest = DownloadDestLocalFile(local_file_name)
>>> bucket.download_file_by_name(b2_file_name, download_dest)
{'fileId': '4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044',
'fileName': 'dummy_new.pdf',
'contentType': 'application/pdf',
'contentLength': 1870579,
'contentSha1': 'd821849a70922e87c2b0786c0be7266b89d87df0',
'fileInfo': {'how': 'good-file'}}
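A download can also be kept in memory instead of being written to disk. A sketch using DownloadDestBytes (one of the download destination classes listed in the API reference below), assuming its get_bytes_written() accessor:
>>> from b2sdk.v1 import DownloadDestBytes
>>> download_dest = DownloadDestBytes()
>>> bucket.download_file_by_name('dummy_new.pdf', download_dest)
>>> data = download_dest.get_bytes_written()  # the downloaded content as bytes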
List files¶
>>> bucket_name = 'example-mybucket-b2'
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> for file_info, folder_name in bucket.ls(show_versions=False):
...     print(file_info.file_name, file_info.upload_timestamp, folder_name)
f2.txt 1560927489000 None
som2.pdf 1554296578000 None
some.pdf 1554296579000 None
test-folder/.bzEmpty 1561005295000 test-folder/
# Recursive
>>> bucket_name = 'example-mybucket-b2'
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> for file_info, folder_name in bucket.ls(show_versions=False, recursive=True):
...     print(file_info.file_name, file_info.upload_timestamp, folder_name)
f2.txt 1560927489000 None
som2.pdf 1554296578000 None
some.pdf 1554296579000 None
test-folder/.bzEmpty 1561005295000 test-folder/
test-folder/folder_file.txt 1561005349000 None
Note: the files are returned recursively and in order, so all files in a folder are printed one after another. The folder_name is returned only for the first file in the folder.
# Within folder
>>> bucket_name = 'example-mybucket-b2'
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> for file_info, folder_name in bucket.ls(folder_to_list='test-folder', show_versions=False):
...     print(file_info.file_name, file_info.upload_timestamp, folder_name)
test-folder/.bzEmpty 1561005295000 None
test-folder/folder_file.txt 1561005349000 None
# List file versions
>>> for file_info, folder_name in bucket.ls(show_versions=True):
...     print(file_info.file_name, file_info.upload_timestamp, folder_name)
f2.txt 1560927489000 None
f2.txt 1560849524000 None
som2.pdf 1554296578000 None
some.pdf 1554296579000 None
For more information, see b2sdk.v1.Bucket.ls().
Get file metadata¶
>>> file_id = '4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044'
>>> b2_api.get_file_info(file_id)
{'accountId': '451862be08d0',
'action': 'upload',
'bucketId': '5485a1682662eb3e60980d10',
'contentLength': 1870579,
'contentSha1': 'd821849a70922e87c2b0786c0be7266b89d87df0',
'contentType': 'application/pdf',
'fileId': '4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044',
'fileInfo': {'how': 'good-file'},
'fileName': 'dummy_new.pdf',
'uploadTimestamp': 1554361150000}
Copy file¶
Please switch to b2sdk.v2.Bucket.copy().
>>> file_id = '4_z5485a1682662eb3e60980d10_f118df9ba2c5131e8_d20190619_m065809_c002_v0001126_t0040'
>>> bucket.copy(file_id, 'f2_copy.txt')
{'accountId': '451862be08d0',
'action': 'copy',
'bucketId': '5485a1682662eb3e60980d10',
'contentLength': 124,
'contentSha1': '737637702a0e41dda8b7be79c8db1d369c6eef4a',
'contentType': 'text/plain',
'fileId': '4_z5485a1682662eb3e60980d10_f1022e2320daf707f_d20190620_m122848_c002_v0001123_t0020',
'fileInfo': {'src_last_modified_millis': '1560848707000'},
'fileName': 'f2_copy.txt',
'uploadTimestamp': 1561033728000}
If the content length is not provided and the file is larger than 5GB, the copy will not succeed and an error will be raised. If the length is provided, the file may be copied as a large file. The maximum copy part size can be set with max_copy_part_size; if not set, it defaults to 5GB. If max_copy_part_size is lower than absoluteMinimumPartSize, the file will be copied in a single request; this may be used to force a single-request copy of a large file that fits within the server's small file limit.
If you want to copy just a part of the file, you can specify the offset and content length:
>>> file_id = '4_z5485a1682662eb3e60980d10_f118df9ba2c5131e8_d20190619_m065809_c002_v0001126_t0040'
>>> bucket.copy(file_id, 'f2_copy.txt', offset=1024, length=2048)
Note that content length is required for offset values other than zero.
For more information, see b2sdk.v1.Bucket.copy().
Delete file¶
>>> file_id = '4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044'
>>> file_info = b2_api.delete_file_version(file_id, 'dummy_new.pdf')
>>> print(file_info)
{'file_id': '4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044',
'file_name': 'dummy_new.pdf'}
Cancel large file uploads¶
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> for file_version in bucket.list_unfinished_large_files():
...     bucket.cancel_large_file(file_version.file_id)
Advanced usage patterns¶
The B2 server API allows for the creation of an object from existing objects, which makes it possible to avoid transferring data from the source machine if the desired outcome can be (at least partially) constructed from what is already on the server.
b2sdk exposes this functionality through a few functions that allow the user to express the desired outcome, while the library takes care of planning and executing the work. Please refer to the table below to compare the support of object creation methods for various usage patterns.
Available methods¶
Method | Source | Range | Streaming | Continuation
---|---|---|---|---
upload | local | no | no | automatic
copy | remote | no | no | automatic
concatenate | any | no | no | automatic
concatenate_stream | any | no | yes | manual
create_file | any | yes | no | automatic
create_file_stream | any | yes | yes | manual
Range overlap¶
Some methods support overlapping ranges between local and remote files. b2sdk tries to use the remote ranges as much as possible, but due to limitations of b2_copy_part (specifically the minimum size of a part) that may not always be possible. A possible solution in such a case is to download a (small) range and then upload it along with another one, to meet the b2_copy_part requirements. This can be avoided if the same data is already available locally: in such a case b2sdk will use the local range rather than downloading it.
Streaming interface¶
Some object creation methods start writing data before reading the whole input (iterator). This can be used to write objects whose contents are not fully known ahead of time, without first writing them locally so that they could be copied. Such a usage pattern can be relevant for small devices which stream data to B2 from an external NAS, where caching large files such as media files or virtual machine images is not an option.
Please see the advanced method support table to see where the streaming interface is supported.
Concatenate files¶
b2sdk.v1.Bucket.concatenate() accepts an iterable of upload sources (either local or remote). It can be used to glue remote files together, back-to-back, into a new file.
b2sdk.v1.Bucket.concatenate_stream() does not create and validate a plan before starting the transfer, so it can be used to process a large input iterator, at the cost of limited automated continuation.
Concatenate files of known size¶
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> input_sources = [
...     RemoteUploadSource('4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044', offset=100, length=100),
...     LocalUploadSource('my_local_path/to_file.txt'),
...     RemoteUploadSource('4_z5485a1682662eb3e60980d10_f1022e2320daf707f_d20190620_m122848_c002_v0001123_t0020', length=2123456789),
... ]
>>> file_info = {'how': 'good-file'}
>>> bucket.concatenate(input_sources, remote_name, file_info)
<b2sdk.file_version.FileVersionInfo at 0x7fc8cd560551>
If one of the remote sources has a length smaller than absoluteMinimumPartSize, then it cannot be copied into a large file part. Such a remote source would be downloaded and concatenated locally with a local source or with another downloaded remote source.
Please note that this method only allows checksum verification for local upload sources. Checksum verification for remote sources is available only when a local copy is available. In such a case b2sdk.v1.Bucket.create_file() can be used with overlapping ranges in the input.
For more information about concatenate, please see b2sdk.v1.Bucket.concatenate() and b2sdk.v1.RemoteUploadSource.
Concatenate files of known size (streamed version)¶
b2sdk.v1.Bucket.concatenate_stream() also accepts an iterable of upload sources (either local or remote). The operation is not planned ahead, so it supports very large output objects, but continuation is only possible for local-only sources with the unfinished large file id provided. See more about continuation in the b2sdk.v1.Bucket.create_file() paragraph about continuation.
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> input_sources = [
...     RemoteUploadSource('4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044', offset=100, length=100),
...     LocalUploadSource('my_local_path/to_file.txt'),
...     RemoteUploadSource('4_z5485a1682662eb3e60980d10_f1022e2320daf707f_d20190620_m122848_c002_v0001123_t0020', length=2123456789),
... ]
>>> file_info = {'how': 'good-file'}
>>> bucket.concatenate_stream(input_sources, remote_name, file_info)
<b2sdk.file_version.FileVersionInfo at 0x7fc8cd560551>
Concatenate files of unknown size¶
While it is supported by the B2 server, this pattern is currently not supported by b2sdk.
Synthesize an object¶
Using the methods described below, an object can be created from both local and remote sources, while avoiding the download of small ranges that are already present on a local drive.
Update a file efficiently¶
b2sdk.v1.Bucket.create_file() accepts an iterable which can contain overlapping destination ranges.
Note
The following examples create a new file. Data in a bucket is immutable, but b2sdk can create a new file version with the same name and updated content.
Append to the end of a file¶
The assumption here is that the file has been appended to since it was last uploaded. b2sdk verifies this assumption when possible by recalculating the checksums of the overlapping remote and local ranges. If the SHA1 of a copied remote part does not match the locally available source, the file creation process is interrupted and an exception is raised.
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> input_sources = [
...     WriteIntent(
...         data=RemoteUploadSource(
...             '4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044',
...             offset=0,
...             length=5000000,
...         ),
...         destination_offset=0,
...     ),
...     WriteIntent(
...         data=LocalUploadSource('my_local_path/to_file.txt'),  # of length 60000000
...         destination_offset=0,
...     ),
... ]
>>> file_info = {'how': 'good-file'}
>>> bucket.create_file(input_sources, remote_name, file_info)
<b2sdk.file_version.FileVersionInfo at 0x7fc8cd560552>
LocalUploadSource has its size determined automatically in this case. This is more efficient than b2sdk.v1.Bucket.concatenate(), as the overlapping ranges can be used when a remote part is smaller than absoluteMinimumPartSize, to prevent downloading a range (when concatenating, the local source would have its destination offset at the end of the remote source).
For more information, see b2sdk.v1.Bucket.create_file().
Change the middle of the remote file¶
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> input_sources = [
... WriteIntent(
... RemoteUploadSource('4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044', offset=0, length=4000000),
... destination_offset=0,
... ),
... WriteIntent(
...     LocalUploadSource('my_local_path/to_file.txt'),  # length=1024, not passed here - determined from the local source using seek
... destination_offset=4000000,
... ),
... WriteIntent(
... RemoteUploadSource('4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044', offset=4001024, length=123456789),
... destination_offset=4001024,
... ),
... ]
>>> file_info = {'how': 'good-file'}
>>> bucket.create_file(input_sources, remote_name, file_info)
<b2sdk.file_version.FileVersionInfo at 0x7fc8cd560552>
LocalUploadSource has its size determined automatically in this case. This is more efficient than b2sdk.v1.Bucket.concatenate(), as the overlapping ranges can be used when a remote part is smaller than absoluteMinimumPartSize, to prevent downloading a range.
For more information, see b2sdk.v1.Bucket.create_file().
Synthesize a file from local and remote parts¶
This is useful for expert usage patterns such as:
synthetic backup
reverse synthetic backup
mostly-server-side cutting and gluing of uncompressed media files such as wav and avi, with rewriting of file headers
various deduplicated backup scenarios
Please note that b2sdk.v1.Bucket.create_file_stream() accepts an ordered iterable which can contain overlapping ranges, so the operation does not need to be planned ahead, but can be streamed, which supports very large output objects.
Scenarios such as the one below are then possible:
A C D G
| | | |
| cloud-AC | | cloud-DG |
| | | |
v v v v
############ #############
^ ^
| |
+---- desired file A-G --------+
| |
| |
| ######################### |
| ^ ^ |
| | | |
| | local file-BF | |
| | | |
A B C D E F G
>>> bucket = b2_api.get_bucket_by_name(bucket_name)
>>> def generate_input():
... yield WriteIntent(
... RemoteUploadSource('4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044', offset=0, length=lengthC),
... destination_offset=0,
... )
... yield WriteIntent(
...     LocalUploadSource('my_local_path/to_file.txt'),  # length = offsetF - offsetB
... destination_offset=offsetB,
... )
... yield WriteIntent(
... RemoteUploadSource('4_z5485a1682662eb3e60980d10_f113f963288e711a6_d20190404_m065910_c002_v0001095_t0044', offset=0, length=offsetG-offsetD),
... destination_offset=offsetD,
... )
...
>>> file_info = {'how': 'good-file'}
>>> bucket.create_file(generate_input(), remote_name, file_info)
<b2sdk.file_version.FileVersionInfo at 0x7fc8cd560552>
In such a case, if the sizes allow for it (there would be no parts smaller than absoluteMinimumPartSize), the only uploaded part will be C-D. Otherwise, more data will be uploaded, but the data transfer will still be reduced in most cases. b2sdk.v1.Bucket.create_file() does not guarantee that outbound transfer usage will be optimal; it uses a simple greedy algorithm with as small look-aheads as possible.
For more information, see b2sdk.v1.Bucket.create_file().
Prioritize remote or local sources¶
b2sdk.v1.Bucket.create_file() and b2sdk.v1.Bucket.create_file_stream() support source/origin prioritization, so that the planner knows which sources should be used for overlapping ranges. Supported values are: local, remote and local_verification.
A D G
| | |
| cloud-AD | |
| | |
v v |
################ |
^ |
| |
+---- desired file A-G --------+
| |
| |
| ####### #################
| ^ ^ ^ |
| | | | |
| | local file BC and DE |
| | | | |
A B C D E
A=0, B=50M, C=80M, D=100M, E=200M
>>> bucket.create_file(input_sources, remote_name, file_info, prioritize='local')
# planner parts: cloud[A, B], local[B, C], remote[C, D], local[D, E]
Here the planner used a remote source only where a local range was not available, minimizing downloads.
>>> bucket.create_file(input_sources, remote_name, file_info, prioritize='remote')
# planner parts: cloud[A, D], local[D, E]
Here the planner used a local source only where a remote range was not available, minimizing uploads.
>>> bucket.create_file(input_sources, remote_name, file_info)
# or
>>> bucket.create_file(input_sources, remote_name, file_info, prioritize='local_verification')
# planner parts: cloud[A, B], cloud[B, C], cloud[C, D], local[D, E]
In local_verification mode, the remote range was artificially split into three parts to allow for checksum verification against the matching local ranges.
Note
prioritize is just a planner setting - remote parts are always verified if matching local parts exist.
Continuation¶
Continuation of upload¶
In order to continue a simple upload session, b2sdk checks for any available sessions with the same file name, file_info and media type, verifying the size of the object as much as possible.
To support automatic continuation, some advanced methods create a plan before starting copy/upload operations, saving the hash of that plan in file_info for increased reliability.
If that is not available, large_file_id can be extracted via a callback when the operation starts. It can then be passed into a subsequent call to continue the same task, though the responsibility for passing the exact same input then rests on the user of the function. Please see the advanced method support table to see where automatic continuation is supported. large_file_id can also be passed when automatic continuation is available, in order to avoid issues where multiple matching upload sessions match the transfer.
Continuation of create/concatenate¶
b2sdk.v1.Bucket.create_file() supports both automatic and manual continuation. b2sdk.v1.Bucket.create_file_stream() supports only manual continuation, and only for local-only inputs. The situation is the same for b2sdk.v1.Bucket.concatenate() and b2sdk.v1.Bucket.concatenate_stream() (the streamed version supports only manual continuation of local sources). b2sdk.v1.Bucket.upload() and b2sdk.v2.Bucket.copy() also support both automatic and manual continuation.
Manual continuation¶
>>> def large_file_callback(large_file):
... # storage is not part of the interface - here only for demonstration purposes
... storage.store({'name': remote_name, 'large_file_id': large_file.id})
>>> bucket.create_file(input_sources, remote_name, file_info, large_file_callback=large_file_callback)
# ...
>>> large_file_id = storage.query({'name': remote_name})[0]['large_file_id']
>>> bucket.create_file(input_sources, remote_name, file_info, large_file_id=large_file_id)
Manual continuation (streamed version)¶
>>> def large_file_callback(large_file):
... # storage is not part of the interface - here only for demonstration purposes
... storage.store({'name': remote_name, 'large_file_id': large_file.id})
>>> bucket.create_file_stream(input_sources, remote_name, file_info, large_file_callback=large_file_callback)
# ...
>>> large_file_id = storage.query({'name': remote_name})[0]['large_file_id']
>>> bucket.create_file_stream(input_sources, remote_name, file_info, large_file_id=large_file_id)
Streams that contain remote sources cannot be continued with b2sdk.v1.Bucket.create_file(): for such inputs, b2sdk.v1.Bucket.create_file() internally stores plan information in the file info and verifies it before any copy/upload, while b2sdk.v1.Bucket.create_file_stream() cannot store this information. Local-source-only inputs can be safely continued with b2sdk.v1.Bucket.create_file(), in either auto continue or manual continue mode (because plan information is not stored in the file info in such a case).
Auto continuation¶
>>> bucket.create_file(input_sources, remote_name, file_info)
For local-source-only input, b2sdk.v1.Bucket.create_file() tries to find a matching unfinished large file. It verifies the checksums of the uploaded parts against the local sources; the most complete candidate whose uploaded parts all match is automatically selected as the file to continue. If there is no matching candidate (even if there are unfinished files for the same file name), a new large file is started.
In other cases, plan information is generated and b2sdk.v1.Bucket.create_file() tries to find an unfinished large file with matching plan info in its file info. If there are one or more such unfinished large files, b2sdk.v1.Bucket.create_file() verifies the checksums for all locally available parts and chooses any matching candidate. If all candidates fail the uploaded-parts checksum verification, the process is interrupted and an error is raised. In such a case the corrupted unfinished large files should be cancelled manually and b2sdk.v1.Bucket.create_file() retried, or auto continuation should be turned off with auto_continue=False.
No continuation¶
>>> bucket.create_file(input_sources, remote_name, file_info, auto_continue=False)
Note that this only forces the start of a new large file - it is still possible to continue the process in either auto or manual mode.
Glossary¶
- absoluteMinimumPartSize
The smallest large file part size, as indicated during the authorization process by the server (in 2019 it used to be 5MB, but the server can set it dynamically).
- account ID
An identifier of the B2 account (not login). Looks like this: 4ba5845d7aaf.
- application key ID
Since every account ID can have multiple access keys associated with it, the keys need to be distinguished from each other. The application key ID is an identifier of the access key. There are two types of keys: master application key and non-master application key.
- application key
The secret associated with an application key ID, used to authenticate with the server. Looks like this: N2Zug0evLcHDlh_L0Z0AJhiGGdY or 0a1bce5ea463a7e4b090ef5bd6bd82b851928ab2c6 or K0014pbwo1zxcIVMnqSNTfWHReU/O3s.
- b2sdk version
Looks like this: v1.0.0 or 1.0.0 and makes version numbers meaningful. See Pinning versions for more details.
- b2sdk interface version
Looks like this: v2 or b2sdk.v2 and makes maintaining backward compatibility much easier. See interface versions for more details.
- master application key
This is the first key you have access to; it is available on the B2 web application. This key has all capabilities, access to all buckets, and has no file prefix restrictions or expiration. The application key ID of the master application key is equal to the account ID.
- non-master application key
A key which can have restricted capabilities, and can have access only to a certain bucket or even to just a part of it. See https://www.backblaze.com/b2/docs/application_keys.html to learn more. Looks like this: 0014aa9865d6f0000000000b0.
- bucket
A container that holds files. You can think of buckets as the top-level folders in your B2 Cloud Storage account. There is no limit to the number of files in a bucket, but there is a limit of 100 buckets per account. See https://www.backblaze.com/b2/docs/buckets.html to learn more.
About API interfaces¶
Semantic versioning¶
b2sdk follows Semantic Versioning policy, so in essence the version number is MAJOR.MINOR.PATCH
(for example 1.2.3
) and:
we increase MAJOR version when we make incompatible API changes
we increase MINOR version when we add functionality in a backwards-compatible manner, and
we increase PATCH version when we make backwards-compatible bug fixes (unless someone relies on the undocumented behavior of a fixed bug)
Therefore when setting up b2sdk as a dependency, please make sure to match the version appropriately; for example, you could put this in your requirements.txt to make sure your code is compatible with the b2sdk version your user will get from PyPI:
b2sdk>=1.0.0,<2.0.0
Interface versions¶
You might notice that the import structure provided in the documentation looks a little odd: from b2sdk.v1 import ....
The .v1 part is used to keep the interface fluid without risk of breaking applications that use the old signatures.
With new versions, b2sdk will provide functions with signatures matching the old ones, wrapping the new interface in place of the old one. What this means for a developer using b2sdk is that it will just keep working. We have already deleted some legacy functions when moving from .v0 to .v1, providing equivalent wrappers to reduce the migration effort for applications using pre-1.0 versions of b2sdk to just fixing imports.
It also means that b2sdk developers may change the interface in the future without having to maintain many branches and backport fixes to keep compatibility for users of those old branches.
Interface version compatibility¶
A numbered interface will not be exactly identical throughout its lifespan. This should not be a problem for anyone, but just in case, the acceptable differences that a developer must tolerate are listed below.
Exceptions¶
The exception hierarchy may change in a backwards compatible manner and the developer must anticipate it. For example, if b2sdk.v1.ExceptionC inherits directly from b2sdk.v1.ExceptionA, it may one day inherit from b2sdk.v1.ExceptionB, which in turn inherits from b2sdk.v1.ExceptionA. Normally this is not a problem if you use isinstance() and super() properly, but your code should not call the constructor of a parent class by directly naming it, or it might skip the middle class of the hierarchy (ExceptionB in this example).
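A sketch of the safe pattern: handle the specific exceptions you care about, but let the base class b2sdk.v1.exception.B2Error catch everything else, so your code survives changes in the middle of the hierarchy:
>>> from b2sdk.v1.exception import B2Error, DuplicateBucketName
>>> try:
...     bucket = b2_api.create_bucket('example-mybucket-b2-1', 'allPrivate')
... except DuplicateBucketName as e:
...     print(e)  # isinstance() checks keep working if intermediate classes appear
... except B2Error as e:
...     print(e)  # any other error coming from the SDK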
Extensions¶
Even in the same interface version, objects/classes/enums can get additional fields, and their representations such as to_dict() or __repr__ (but not __str__) may start to contain those fields.
Methods and functions can start accepting new optional arguments. New methods can be added to existing classes.
Performance¶
Some effort will be put into keeping the performance of the old interfaces, but in rare situations old interfaces may end up with slightly degraded performance after a new version of the library is released.
If performance is absolutely critical to your application, you can pin your dependencies to the middle version (using b2sdk>=X.Y.0,<X.Y+1.0), as b2sdk will increment the middle version when introducing a new interface version if the wrapper for the older interfaces is likely to affect performance.
Public interface¶
The public interface consists of the public members of the modules listed in the Public API section. This should be used in 99% of use cases; it's enough to implement anything from a console tool to a FUSE filesystem.
Those modules will generally not change in a backwards-incompatible way between non-major versions. Please see interface version compatibility chapter for notes on what changes must be expected.
Hint
If the current version of b2sdk is 4.5.6 and you only use the public interface, put this in your requirements.txt to be safe:
b2sdk>=4.5.6,<5.0.0
Note
b2sdk.*._something and b2sdk.*.*._something, while having a name beginning with an underscore, are NOT considered public interface.
Internal interface¶
Some rarely used features of the B2 cloud are not implemented in b2sdk. Tracking usage of transactions and transferred data is a good example: if it is required, additional work would need to be put into a specialized internal interface layer to enable accounting and reporting.
b2sdk maintainers are very supportive if someone wants to contribute an additional feature. Please consider adding it to the SDK, so that more people can use it. This way it will also receive our updates, unlike a private implementation, which would not receive any updates unless you apply them manually (but that's a lot of work and we both know it's not going to happen). In practice, an implementation can either be shared or it will quickly become outdated. The license of b2sdk is very permissive, but when considering whether to keep your patches private or public, please take into consideration the long-term cost of keeping up with a dynamic open-source project and/or the cost of missing the updates, especially those related to performance and reliability (as those are being actively developed in parallel to documentation).
Internal interface modules are listed in API Internal section.
Note
It is OK for you to use our internal interface (better that than copying our source files!). However, if you do, please pin your dependencies to the middle version, as backwards-incompatible changes may be introduced in a non-major version.
Furthermore, it would be greatly appreciated if an issue were filed for such situations, so that the b2sdk interface can be improved in a future version in order to avoid strict version pinning.
Hint
If the current version of b2sdk is 4.5.6
and you are using the internal interface,
put this in your requirements.txt:
b2sdk>=4.5.6,<4.6.0
Hint
Use Quick Start Guide to quickly jump to examples
API Reference¶
Interface types¶
The b2sdk API is divided into two parts, public and internal. Please pay attention to which interface type you use.
Tip
Pinning versions properly ensures the stability of your application.
Public API¶
AccountInfo¶
AccountInfo stores basic information about the account, such as the Application Key ID and the Application Key, in order to let b2sdk.v1.B2Api perform authenticated requests.
There are two usable implementations provided by b2sdk:
b2sdk.v1.InMemoryAccountInfo - a basic implementation with no persistence
b2sdk.v1.SqliteAccountInfo - for console and GUI applications
They both provide the full AccountInfo interface.
Note
Backup applications and many server-side applications should implement their own AccountInfo, backed by the metadata/configuration database of the application.
AccountInfo implementations¶
InMemoryAccountInfo¶
AccountInfo with no persistence.
class b2sdk.v1.InMemoryAccountInfo [source]¶
AccountInfo which keeps all data in memory.
Implements all methods of the AccountInfo interface.
Hint
Usage of this class is appropriate for secure Web applications which do not wish to persist any user data.
Using this class for applications such as CLI, GUI or backup is discouraged, as InMemoryAccountInfo does not persist the authorization token. That would be slow, as it would force the application to retrieve a new one on every command/click/backup start. Furthermore, an important property of AccountInfo is caching the bucket_name:bucket_id mapping; in the case of InMemoryAccountInfo the cache will be flushed between executions of the program.
SqliteAccountInfo¶
class b2sdk.v1.SqliteAccountInfo [source]¶
Store account information in an sqlite3 database which is used to manage concurrent access to the data.
The update_done table tracks the schema updates that have been completed.
Implements all methods of the AccountInfo interface.
Uses a SQLite database for persistence and access synchronization between multiple processes. Not suitable for usage over NFS.
Hint
Usage of this class is appropriate for interactive applications installed on a user’s machine (i.e.: CLI and GUI applications).
Usage of this class might be appropriate for non-interactive applications installed on the user’s machine, such as backup applications. An alternative approach that should be considered is to store the AccountInfo data alongside the configuration of the rest of the application.
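A minimal usage sketch; with no arguments the class picks a default database location in the user's home directory (an assumption - check the constructor documentation for the exact default):
>>> from b2sdk.v1 import B2Api, SqliteAccountInfo
>>> info = SqliteAccountInfo()  # credentials survive between program runs
>>> b2_api = B2Api(info)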
Implementing your own¶
When building a server-side application or a web service, you might want to implement your own AccountInfo class backed by a database. In such a case, you should inherit from b2sdk.v1.UrlPoolAccountInfo, which has the groundwork for URL pool functionality. If you cannot use it, inherit directly from b2sdk.v1.AbstractAccountInfo.
>>> from b2sdk.v1 import UrlPoolAccountInfo
>>> class MyAccountInfo(UrlPoolAccountInfo):
...
b2sdk.v1.AbstractAccountInfo describes the interface, while b2sdk.v1.UrlPoolAccountInfo and b2sdk.v1.UploadUrlPool implement a part of the interface for in-memory upload token management.
AccountInfo interface¶
class b2sdk.v1.AbstractAccountInfo [source]¶
Abstract class for a holder of all account-related information that needs to be kept between API calls and between invocations of the program.
This includes: account ID, application key ID, application key, auth tokens, API URL, download URL, and upload URLs.
This class must be THREAD SAFE because it may be used by multiple threads running in the same Python process. It also needs to be safe against multiple processes running at the same time.
REALM_URLS = {'dev': 'http://api.backblazeb2.xyz:8180', 'production': 'https://api.backblazeb2.com', 'staging': 'https://api.backblaze.net'}¶
DEFAULT_ALLOWED = {'bucketId': None, 'bucketName': None, 'capabilities': ['listKeys', 'writeKeys', 'deleteKeys', 'listBuckets', 'writeBuckets', 'deleteBuckets', 'listFiles', 'readFiles', 'shareFiles', 'writeFiles', 'deleteFiles'], 'namePrefix': None}¶
abstract refresh_entire_bucket_name_cache(name_id_iterable)[source]¶
Remove all previous name-to-id mappings and store new ones.
- Parameters
name_id_iterable (iterable) – an iterable of tuples of the form (name, id)
abstract remove_bucket_name(bucket_name)[source]¶
Remove one entry from the bucket name cache.
- Parameters
bucket_name (str) – a bucket name
abstract save_bucket(bucket)[source]¶
Remember the ID for the given bucket name.
- Parameters
bucket (b2sdk.v1.Bucket) – a Bucket object
abstract get_bucket_id_or_none_from_bucket_name(bucket_name)[source]¶
Look up the bucket ID for the given bucket name.
abstract clear_bucket_upload_data(bucket_id)[source]¶
Remove all upload URLs for the given bucket.
- Parameters
bucket_id (str) – a bucket ID
abstract get_account_id()[source]¶
Return the account ID or raise a MissingAccountData exception.
- Return type
str
abstract get_application_key_id()[source]¶
Return the application key ID used to authenticate.
- Return type
str
abstract get_account_auth_token()[source]¶
Return account_auth_token or raise a MissingAccountData exception.
- Return type
str
abstract get_application_key()[source]¶
Return application_key or raise a MissingAccountData exception.
- Return type
str
abstract get_download_url()[source]¶
Return download_url or raise a MissingAccountData exception.
- Return type
str
abstract get_minimum_part_size()[source]¶
Return the minimum number of bytes in a part of a large file.
- Returns
number of bytes
- Return type
int
abstract get_allowed()[source]¶
An 'allowed' dict, as returned by b2_authorize_account. Never None; for account info that was saved before 'allowed' existed, returns DEFAULT_ALLOWED.
- Return type
dict
set_auth_data(account_id, auth_token, api_url, download_url, minimum_part_size, application_key, realm, allowed=None, application_key_id=None)[source]¶
Check permission correctness and store the results of b2_authorize_account.
The allowed structure is the one returned by b2_authorize_account, with the addition of a bucketName field. For keys with bucket restrictions, the name of the bucket is looked up and stored as well. The console_tool does everything by bucket name, so it's convenient to have the restricted bucket name handy.
- Parameters
account_id (str) – user account ID
auth_token (str) – user authentication token
api_url (str) – an API URL
download_url (str) – path download URL
minimum_part_size (int) – minimum size of the file part
application_key (str) – application key
realm (str) – a realm to authorize account in
allowed (dict) – the structure to use for old account info that was saved without ‘allowed’
application_key_id (str) – application key ID
Changed in version 0.1.5: account_id_or_app_key_id renamed to application_key_id
classmethod allowed_is_valid(allowed)[source]¶
Make sure that all of the required fields are present, and that bucketId is set if bucketName is.
If the bucketId is for a bucket that no longer exists, or the capabilities do not allow for listBuckets, then we will not have a bucketName.
abstract _set_auth_data(account_id, auth_token, api_url, download_url, minimum_part_size, application_key, realm, allowed, application_key_id)[source]¶
Actually store the auth data. Can assume that 'allowed' is present and valid.
All of the information returned by b2_authorize_account is saved, because all of it is needed at some point.
abstract take_bucket_upload_url(bucket_id)[source]¶
Return a pair (upload_url, upload_auth_token) that has been removed from the pool for this bucket, or (None, None) if there are no more left.
abstract put_bucket_upload_url(bucket_id, upload_url, upload_auth_token)[source]¶
Add an (upload_url, upload_auth_token) pair to the pool available for the bucket.
abstract put_large_file_upload_url(file_id, upload_url, upload_auth_token)[source]¶
Put a large file upload URL into a pool.
AccountInfo helper classes¶
class b2sdk.v1.UrlPoolAccountInfo [source]¶
Implement part of AbstractAccountInfo for upload URL pool management with a simple, key-value storage, such as b2sdk.v1.UploadUrlPool.
Caution
This class is not part of the public interface. To find out how to safely use it, read this.
BUCKET_UPLOAD_POOL_CLASS¶
A URL pool class to use for small files.
LARGE_FILE_UPLOAD_POOL_CLASS¶
A URL pool class to use for large files.
class b2sdk.account_info.upload_url_pool.UploadUrlPool [source]¶
For each key (either a bucket id or large file id), hold a pool of (url, auth_token) pairs.
Caution
This class is not part of the public interface. To find out how to safely use it, read this.
Cache¶
b2sdk caches the mapping between bucket name and bucket id, so that the user of the library does not need to maintain the mapping to call the api.
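For example, a sketch of passing a cache explicitly (AuthInfoCache, listed in the B2Api constructor documentation below, stores the mapping inside the account info object):
>>> from b2sdk.v1 import B2Api, AuthInfoCache, InMemoryAccountInfo
>>> info = InMemoryAccountInfo()
>>> b2_api = B2Api(info, cache=AuthInfoCache(info))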
B2 Api client¶
class b2sdk.v1.B2Api [source]¶
Provide file-level access to B2 services.
While b2sdk.v1.B2RawApi provides direct access to the B2 web APIs, this class handles several things that simplify the task of uploading and downloading files:
it re-acquires authorization tokens when they expire
it retries uploads when an upload URL is busy
it breaks large files into parts
it emulates a directory structure (B2 buckets are flat)
It adds an object-oriented layer on top of the raw API, so that buckets and files returned are Python objects with accessor methods.
The class also keeps a cache of information needed to access the service, such as auth tokens and upload URLs.
BUCKET_FACTORY_CLASS¶
alias of b2sdk.bucket.BucketFactory
BUCKET_CLASS¶
alias of b2sdk.bucket.Bucket
__init__(account_info=None, cache=None, raw_api=None, max_upload_workers=10, max_copy_workers=10)[source]¶
Initialize the API using the given account info.
- Parameters
account_info – an instance of UrlPoolAccountInfo (e.g. SqliteAccountInfo), or any custom class derived from AbstractAccountInfo. To learn more about AccountInfo objects, see here.
cache – an instance of one of the following classes: DummyCache, InMemoryCache, AuthInfoCache, or any custom class derived from AbstractCache. It is used by B2Api to cache the mapping between bucket names and bucket IDs. Default is DummyCache.
raw_api – an instance of one of the following classes: B2RawApi, RawSimulator, or any custom class derived from AbstractRawApi. It makes network-less unit testing simple: use RawSimulator in tests and B2RawApi in production. Default is B2RawApi.
max_upload_workers (int) – the number of upload threads, default is 10
max_copy_workers (int) – the number of copy threads, default is 10
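A sketch of network-less testing with RawSimulator; the create_account() helper and its return value are assumptions for illustration - check the raw_simulator module for the exact interface:
>>> from b2sdk.v1 import B2Api, InMemoryAccountInfo, RawSimulator
>>> raw_api = RawSimulator()  # in-memory stand-in for the B2 service
>>> key_id, key = raw_api.create_account()  # assumed helper returning a key pair
>>> b2_api = B2Api(InMemoryAccountInfo(), raw_api=raw_api)
>>> b2_api.authorize_account('production', key_id, key)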
property account_info¶
property cache¶
property raw_api¶
authorize_automatically()[source]¶
Perform automatic account authorization, retrieving all account data from the account info object passed during initialization.
authorize_account(realm, application_key_id, application_key)[source]¶
Perform account authorization.
- Parameters
realm (str) – a realm to authorize account in (usually just "production")
application_key_id (str) – application key ID
application_key (str) – user's application key
create_bucket(name, bucket_type, bucket_info=None, cors_rules=None, lifecycle_rules=None)[source]¶
Create a bucket.
- Parameters
name (str) – bucket name
bucket_type (str) – a bucket type, could be one of the following values: "allPublic", "allPrivate"
bucket_info (dict) – additional bucket info to store with the bucket
cors_rules (dict) – bucket CORS rules to store with the bucket
lifecycle_rules (dict) – bucket lifecycle rules to store with the bucket
- Returns
a Bucket object
- Return type
b2sdk.v1.Bucket
download_file_by_id(file_id, download_dest, progress_listener=None, range_=None)[source]¶
Download a file with the given ID.
- Parameters
file_id (str) – a file ID
download_dest – an instance of one of the following classes: DownloadDestLocalFile, DownloadDestBytes, DownloadDestProgressWrapper, PreSeekedDownloadDest, or any subclass of AbstractDownloadDestination
progress_listener – an instance of one of the following classes: PartProgressReporter, TqdmProgressListener, SimpleProgressListener, DoNothingProgressListener, ProgressListenerForTest, SyncFileReporter, or any subclass of AbstractProgressListener
range_ (list) – a list of two integers; the first one is the start position and the second one is the end position in the file
- Returns
context manager that returns an object that supports iter_content()
get_bucket_by_id(bucket_id)[source]¶
Return a bucket object with a given ID. Unlike get_bucket_by_name, this method does not need to make any API calls.
- Parameters
bucket_id (str) – a bucket ID
- Returns
a Bucket object
- Return type
b2sdk.v1.Bucket
get_bucket_by_name(bucket_name)[source]¶
Return the Bucket matching the given bucket_name.
- Parameters
bucket_name (str) – the name of the bucket to return
- Returns
a Bucket object
- Return type
b2sdk.v1.Bucket
- Raises
b2sdk.v1.exception.NonExistentBucket – if the bucket does not exist in the account
delete_bucket(bucket)[source]¶
Delete a chosen bucket.
- Parameters
bucket (b2sdk.v1.Bucket) – a bucket to delete
- Return type
None
list_buckets(bucket_name=None)[source]¶
Call b2_list_buckets and return a list of buckets.
When no bucket name is specified, returns all of the buckets in the account. When a bucket name is given, returns just that bucket. When authorized with an application key restricted to one bucket, you must specify the bucket name, or the request will be unauthorized.
- Parameters
bucket_name (str) – the name of the one bucket to return
- Return type
list[b2sdk.v1.Bucket]
list_parts(file_id, start_part_number=None, batch_size=None)[source]¶
Generator that yields a b2sdk.v1.Part for each of the parts that have been uploaded.
delete_file_version(file_id, file_name)[source]¶
Permanently and irrevocably delete one version of a file.
- Parameters
file_id (str) – a file ID
file_name (str) – a file name
get_download_url_for_fileid(file_id)[source]¶
Return a URL to download the given file by ID.
- Parameters
file_id (str) – a file ID
get_download_url_for_file_name(bucket_name, file_name)[source]¶
Return a URL to download the given file by name.
- Parameters
bucket_name (str) – a bucket name
file_name (str) – a file name
create_key(capabilities, key_name, valid_duration_seconds=None, bucket_id=None, name_prefix=None)[source]¶
Create a new application key.
- Parameters
capabilities (list) – a list of capabilities
key_name (str) – a name of a key
valid_duration_seconds (int, None) – key auto-expire time after it is created, in seconds, or None to not expire
bucket_id (str, None) – a bucket ID to restrict the key to, or None to not restrict
name_prefix (str, None) – a remote filename prefix to restrict the key to, or None to not restrict
delete_key(application_key_id)[source]¶
Delete application key.
- Parameters
application_key_id (str) – an application key ID
list_keys(start_application_key_id=None)[source]¶
List application keys.
- Parameters
start_application_key_id (str, None) – an application key ID to start from, or None to start from the beginning
check_bucket_restrictions(bucket_name)[source]¶
Check to see if the allowed field from authorize-account has a bucket restriction.
If it does, check whether the bucket_name for a given API call matches it. If not, raise a b2sdk.v1.exception.RestrictedBucket error.
- Parameters
bucket_name (str) – a bucket name
- Raises
b2sdk.v1.exception.RestrictedBucket – if the account is not allowed to use this bucket
Exceptions¶
exception b2sdk.v1.exception.AccountInfoError(*args, **kwargs)[source]¶
Base class for all account info errors.
exception b2sdk.v1.exception.B2Error(*args, **kwargs)[source]¶
property prefix¶
Nice, auto-generated error message prefix.
>>> B2SimpleError().prefix
'Simple error'
>>> AlreadyFailed().prefix
'Already failed'
exception b2sdk.v1.exception.B2SimpleError(*args, **kwargs)[source]¶
A B2Error with a message prefix.
exception b2sdk.v1.exception.ClockSkew(clock_skew_seconds)[source]¶
The clock on the server differs from the local clock by too much.
exception b2sdk.v1.exception.CommandError(message)[source]¶
b2 command error (user caused). Accepts exactly one argument: message.
We expect users of shell scripts will parse our __str__ output.
exception b2sdk.v1.exception.CorruptAccountInfo(file_name)[source]¶
Raised when an account info file is corrupted.
exception b2sdk.v1.exception.DestFileNewer(dest_file, source_file, dest_prefix, source_prefix)[source]¶
exception b2sdk.v1.exception.DuplicateBucketName(*args, **kwargs)[source]¶
prefix = 'Bucket name is already in use'¶
exception b2sdk.v1.exception.EnvironmentEncodingError(filename, encoding)[source]¶
Raised when a file name can not be decoded with system encoding.
exception b2sdk.v1.exception.FileSha1Mismatch(*args, **kwargs)[source]¶
prefix = 'Upload file SHA1 mismatch'¶
exception b2sdk.v1.exception.InvalidArgument(parameter_name, message)[source]¶
Raised when one or more arguments are invalid.
exception b2sdk.v1.exception.InvalidAuthToken(message, code)[source]¶
Specific type of Unauthorized that means the auth token is invalid. This is not the case where the auth token is valid, but does not allow access.
exception b2sdk.v1.exception.MissingAccountData(key)[source]¶
Raised when there is no account info data available.
exception b2sdk.v1.exception.MissingPart(*args, **kwargs)[source]¶
prefix = 'Part number has not been uploaded'¶
exception b2sdk.v1.exception.NotAllowedByAppKeyError(*args, **kwargs)[source]¶
Base class for errors caused by restrictions on an application key.
exception b2sdk.v1.exception.ServiceError(*args, **kwargs)[source]¶
Used for HTTP status codes 500 through 599.
Return true if this is an error that should tell the upload code to get a new upload URL and try the upload again.
exception b2sdk.v1.exception.UnSyncableFilename(message, filename)[source]¶
Raised when a filename is not supported by the sync operation.
exception b2sdk.v1.exception.UnusableFileName(*args, **kwargs)[source]¶
Raised when a filename doesn't meet the rules.
Could possibly use InvalidUploadSource, but this is intended for the filename on the server, which could differ. https://www.backblaze.com/b2/docs/files.html.
B2 Bucket¶
class b2sdk.v1.Bucket [source]¶
Provide access to a bucket in B2: listing files, uploading and downloading.
DEFAULT_CONTENT_TYPE = 'b2/x-auto'¶
__init__(api, id_, name=None, type_=None, bucket_info=None, cors_rules=None, lifecycle_rules=None, revision=None, bucket_dict=None, options_set=None)[source]¶
- Parameters
api (b2sdk.v1.B2Api) – an API object
id_ (str) – a bucket ID
name (str) – a bucket name
type_ (str) – a bucket type
bucket_info (dict) – info to store with a bucket
cors_rules (dict) – CORS rules to store with a bucket
lifecycle_rules (dict) – lifecycle rules to store with a bucket
revision (int) – a bucket revision number
bucket_dict (dict) – a dictionary which contains bucket parameters
options_set (set) – a set of bucket option strings
-
set_type
(bucket_type)[source]¶ Update bucket type.
- Parameters
bucket_type (str) – a bucket type (“allPublic” or “allPrivate”)
-
update
(bucket_type=None, bucket_info=None, cors_rules=None, lifecycle_rules=None, if_revision_is=None)[source]¶ Update various bucket parameters.
- Parameters
bucket_type (str) – a bucket type
bucket_info (dict) – an info to store with a bucket
cors_rules (dict) – CORS rules to store with a bucket
lifecycle_rules (dict) – lifecycle rules to store with a bucket
if_revision_is (int) – revision number, update the info only if revision equals to if_revision_is
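For example, to flip a bucket to private only if nobody has modified it in the meantime, pass the last known revision. A sketch, assuming the Bucket object exposes the revision it was fetched with as a revision attribute:
>>> bucket = b2_api.get_bucket_by_name('example-mybucket-b2')
>>> bucket.update(bucket_type='allPrivate', if_revision_is=bucket.revision)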
-
cancel_large_file
(file_id)[source]¶ Cancel a large file transfer.
- Parameters
file_id (str) – a file ID
-
download_file_by_id
(file_id, download_dest, progress_listener=None, range_=None)[source]¶ Download a file by ID.
Note
download_file_by_id actually belongs in
b2sdk.v1.B2Api
, not inb2sdk.v1.Bucket
; we just provide a convenient redirect here- Parameters
file_id (str) – a file ID
download_dest – an instance of one of the following classes: DownloadDestLocalFile, DownloadDestBytes, DownloadDestProgressWrapper, PreSeekedDownloadDest, or any subclass of AbstractDownloadDestination
progress_listener (b2sdk.v1.AbstractProgressListener, None) – a progress listener object to use, or None to not report progress
range_ (tuple[int, int]) – two integer values, start and end offsets
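A minimal download sketch, assuming file_id holds the ID of an existing file and the local path (hypothetical here) is writable:
>>> from b2sdk.v1 import DownloadDestLocalFile
>>> download_dest = DownloadDestLocalFile('/tmp/example-downloaded.bin')
>>> bucket.download_file_by_id(file_id, download_dest)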
-
download_file_by_name
(file_name, download_dest, progress_listener=None, range_=None)[source]¶ Download a file by name.
See also
Synchronizer, a high-performance utility that synchronizes a local folder with a Bucket.
- Parameters
file_name (str) – a file name
download_dest – an instance of one of the following classes: DownloadDestLocalFile, DownloadDestBytes, DownloadDestProgressWrapper, PreSeekedDownloadDest, or any subclass of AbstractDownloadDestination
progress_listener (b2sdk.v1.AbstractProgressListener, None) – a progress listener object to use, or None to not track progress
range_ (tuple[int, int]) – two integer values, start and end offsets
-
get_download_authorization
(file_name_prefix, valid_duration_in_seconds)[source]¶ Return an authorization token that is valid only for downloading files from the given bucket.
-
list_parts
(file_id, start_part_number=None, batch_size=None)[source]¶ Get a list of all parts that have been uploaded for a given file.
-
list_file_versions
(file_name, fetch_count=None)[source]¶ Lists all of the versions for a single file.
- Parameters
file_name (str) – the name of the file to list versions for
fetch_count (int, None) – how many entries to fetch per API call, or None to use the default
- Return type
generator[b2sdk.v1.FileVersionInfo]
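For example, to walk all stored versions of one file (assuming the bucket contains hello.txt):
>>> for version in bucket.list_file_versions('hello.txt'):
        print(version.id_, version.action, version.upload_timestamp)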
-
ls
(folder_to_list='', show_versions=False, recursive=False, fetch_count=None)[source]¶ Pretend that folders exist and yield information about the files in a folder.
B2 has a flat namespace for the files in a bucket, but there is a convention of using “/” as if there were folders. This method searches through the flat namespace to find the files and “folders” that live within a given folder.
When the recursive flag is set, lists all of the files in the given folder, and all of its sub-folders.
- Parameters
folder_to_list (str) – the name of the folder to list; must not start with “/”. Empty string means top-level folder
show_versions (bool) – when
True
returns info about all versions of a file, whenFalse
, just returns info about the most recent versionsrecursive (bool) – if
True
, list folders recursivelyfetch_count (int,None) – how many entries to return or
None
to use the default. Acceptable values: 1 - 1000
- Return type
generator[tuple[b2sdk.v1.FileVersionInfo, str]]
- Returns
generator of (file_version_info, folder_name) tuples
Note
When recursive=True, folder_name is returned only for the first file in each folder.
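A sketch of a recursive listing, assuming an existing bucket object:
>>> for file_version_info, folder_name in bucket.ls(recursive=True):
        print(file_version_info.file_name, file_version_info.size)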
-
list_unfinished_large_files
(start_file_id=None, batch_size=None, prefix=None)[source]¶ A generator that yields an
b2sdk.v1.UnfinishedLargeFile
for each unfinished large file in the bucket, starting at the given file, filtering by prefix.
-
start_large_file
(file_name, content_type=None, file_info=None)[source]¶ Start a large file transfer.
-
upload_bytes
(data_bytes, file_name, content_type=None, file_infos=None, progress_listener=None)[source]¶ Upload bytes in memory to a B2 file.
- Parameters
data_bytes (bytes) – a byte array to upload
file_name (str) – a file name to upload bytes to
content_type (str,None) – the MIME type, or
None
to accept the default based on file extension of the B2 file namefile_infos (dict,None) – a file info to store with the file or
None
to not store anythingprogress_listener (b2sdk.v1.AbstractProgressListener,None) – a progress listener object to use, or
None
to not track progress
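For example, a minimal sketch assuming an existing bucket object:
>>> data = b'hello world'
>>> bucket.upload_bytes(
        data,
        'hello.txt',
        content_type='text/plain',
        file_infos={'how': 'good-file'},
    )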
-
upload_local_file
(local_file, file_name, content_type=None, file_infos=None, sha1_sum=None, min_part_size=None, progress_listener=None)[source]¶ Upload a file on local disk to a B2 file.
See also
Synchronizer, a high-performance utility that synchronizes a local folder with a bucket.
- Parameters
local_file (str) – a path to a file on local disk
file_name (str) – a file name of the new B2 file
content_type (str,None) – the MIME type, or
None
to accept the default based on file extension of the B2 file namefile_infos (dict,None) – a file info to store with the file or
None
to not store anythingsha1_sum (str,None) – file SHA1 hash or
None
to compute it automaticallymin_part_size (int) – a minimum size of a part
progress_listener (b2sdk.v1.AbstractProgressListener,None) – a progress listener object to use, or
None
to not report progress
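A sketch (the local path is hypothetical):
>>> bucket.upload_local_file(
        local_file='/home/user1/b2_example/hello.txt',
        file_name='folder/hello.txt',
    )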
-
upload
(upload_source, file_name, content_type=None, file_info=None, min_part_size=None, progress_listener=None)[source]¶ Upload a file to B2, retrying as needed.
The source of the upload is an UploadSource object that can be used to open (and re-open) the file. The result of opening should be a binary file whose read() method returns bytes.
- Parameters
upload_source (b2sdk.v1.UploadSource) – an object that opens the source of the upload
file_name (str) – the file name of the new B2 file
content_type (str,None) – the MIME type, or
None
to accept the default based on file extension of the B2 file namefile_infos (dict,None) – a file info to store with the file or
None
to not store anythingmin_part_size (int,None) – the smallest part size to use or
None
to determine automaticallyprogress_listener (b2sdk.v1.AbstractProgressListener,None) – a progress listener object to use, or
None
to not report progress
The function opener should return a file-like object, and it must be possible to call it more than once in case the upload is retried.
-
create_file
(write_intents, file_name, content_type=None, file_info=None, progress_listener=None, recommended_upload_part_size=None, continue_large_file_id=None)[source]¶ Creates a new file in this bucket using an iterable (list, tuple etc) of remote or local sources.
Source ranges can overlap and remote sources will be prioritized over local sources (when possible). For more information and usage examples please see Advanced usage patterns.
- Parameters
write_intents (list[b2sdk.v1.WriteIntent]) – list of write intents (remote or local sources)
file_name (str) – file name of the new file
content_type (str,None) – content_type for the new file, if
None
content_type would be automatically determined or it may be copied if it resolves as single part remote source copyfile_info (dict,None) – file_info for the new file, if
None
it will be set to empty dict or it may be copied if it resolves as single part remote source copyprogress_listener (b2sdk.v1.AbstractProgressListener,None) – a progress listener object to use, or
None
to not report progressrecommended_upload_part_size (int,None) – the recommended part size to use for uploading local sources or
None
to determine automatically, but remote sources would be copied with maximum possible part sizecontinue_large_file_id (str,None) – large file id that should be selected to resume file creation for multipart upload/copy,
None
for automatic search for this id
-
create_file_stream
(write_intents_iterator, file_name, content_type=None, file_info=None, progress_listener=None, recommended_upload_part_size=None, continue_large_file_id=None)[source]¶ Creates a new file in this bucket using a stream of multiple remote or local sources.
Source ranges can overlap and remote sources will be prioritized over local sources (when possible). For more information and usage examples please see Advanced usage patterns.
- Parameters
write_intents_iterator (iterator[b2sdk.v1.WriteIntent]) – iterator of write intents which are sorted ascending by
destination_offset
file_name (str) – file name of the new file
content_type (str,None) – content_type for the new file, if
None
content_type would be automatically determined or it may be copied if it resolves as single part remote source copyfile_info (dict,None) – file_info for the new file, if
None
it will be set to empty dict or it may be copied if it resolves as single part remote source copyprogress_listener (b2sdk.v1.AbstractProgressListener,None) – a progress listener object to use, or
None
to not report progressrecommended_upload_part_size (int,None) – the recommended part size to use for uploading local sources or
None
to determine automatically, but remote sources would be copied with maximum possible part sizecontinue_large_file_id (str,None) – large file id that should be selected to resume file creation for multipart upload/copy, if
None
in multipart case it would always start a new large file
-
concatenate
(outbound_sources, file_name, content_type=None, file_info=None, progress_listener=None, recommended_upload_part_size=None, continue_large_file_id=None)[source]¶ Creates a new file in this bucket by concatenating multiple remote or local sources.
- Parameters
outbound_sources (list[b2sdk.v1.OutboundTransferSource]) – list of outbound sources (remote or local)
file_name (str) – file name of the new file
content_type (str,None) – content_type for the new file, if
None
content_type would be automatically determined from file name or it may be copied if it resolves as single part remote source copyfile_info (dict,None) – file_info for the new file, if
None
it will be set to empty dict or it may be copied if it resolves as single part remote source copyprogress_listener (b2sdk.v1.AbstractProgressListener,None) – a progress listener object to use, or
None
to not report progressrecommended_upload_part_size (int,None) – the recommended part size to use for uploading local sources or
None
to determine automatically, but remote sources would be copied with maximum possible part sizecontinue_large_file_id (str,None) – large file id that should be selected to resume file creation for multipart upload/copy,
None
for automatic search for this id
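A sketch that glues a remote range to a local file. Here remote_file_id is assumed to identify an existing B2 file, the local path is hypothetical, and the sizes are illustrative only; real multipart parts must satisfy the B2 minimum part size:
>>> from b2sdk.v1 import CopySource, UploadSourceLocalFile
>>> bucket.concatenate(
        [
            CopySource(remote_file_id, offset=0, length=5 * 1024 * 1024),
            UploadSourceLocalFile('/tmp/tail.bin'),
        ],
        'joined.bin',
    )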
-
concatenate_stream
(outbound_sources_iterator, file_name, content_type=None, file_info=None, progress_listener=None, recommended_upload_part_size=None, continue_large_file_id=None)[source]¶ Creates a new file in this bucket by concatenating a stream of multiple remote or local sources.
- Parameters
outbound_sources_iterator (iterator[b2sdk.v1.OutboundTransferSource]) – iterator of outbound sources
file_name (str) – file name of the new file
content_type (str,None) – content_type for the new file, if
None
content_type would be automatically determined or it may be copied if it resolves as single part remote source copyfile_info (dict,None) – file_info for the new file, if
None
it will be set to empty dict or it may be copied if it resolves as single part remote source copyprogress_listener (b2sdk.v1.AbstractProgressListener,None) – a progress listener object to use, or
None
to not report progressrecommended_upload_part_size (int,None) – the recommended part size to use for uploading local sources or
None
to determine automatically, but remote sources would be copied with maximum possible part sizecontinue_large_file_id (str,None) – large file id that should be selected to resume file creation for multipart upload/copy, if
None
in multipart case it would always start a new large file
-
copy
(file_id, new_file_name, content_type=None, file_info=None, offset=0, length=None, progress_listener=None)[source]¶ Creates a new file in this bucket by (server-side) copying from an existing file.
- Parameters
file_id (str) – file ID of existing file to copy from
new_file_name (str) – file name of the new file
content_type (str, None) – content_type for the new file; if None and b2_copy_file is used, the content_type will be copied from the source file, otherwise it will be determined automatically
file_info (dict, None) – file_info for the new file; if None and b2_copy_file is used, the file_info will be copied from the source file, otherwise it will be set to an empty dict
offset (int) – offset in the existing file that the copy should start from
length (int, None) – number of bytes to copy; if None, then offset has to be 0 and b2_copy_file will be used without the range parameter, so the call may fail if the file is too large. For large files, length has to be specified so that b2_copy_part can be used instead.
progress_listener (b2sdk.v1.AbstractProgressListener, None) – a progress listener object to use for multipart copy, or None to not report progress
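For example, a whole-file server-side copy of a small file (file_id is assumed to identify an existing file small enough for b2_copy_file without a range):
>>> bucket.copy(file_id, 'copy-of-example.bin')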
-
copy_file
(file_id, new_file_name, bytes_range=None, metadata_directive=None, content_type=None, file_info=None)[source]¶ Creates a new file in this bucket by (server-side) copying from an existing file.
- Parameters
file_id (str) – file ID of existing file
new_file_name (str) – file name of the new file
bytes_range (tuple[int,int],None) – start and end offsets (inclusive!), default is the entire file
metadata_directive (b2sdk.v1.MetadataDirectiveMode,None) – default is
b2sdk.v1.MetadataDirectiveMode.COPY
content_type (str,None) – content_type for the new file if metadata_directive is set to
b2sdk.v1.MetadataDirectiveMode.REPLACE
, default will copy the content_type of old filefile_info (dict,None) – file_info for the new file if metadata_directive is set to
b2sdk.v1.MetadataDirectiveMode.REPLACE
, default will copy the file_info of old file
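A sketch that replaces the metadata during a server-side copy (file_id is assumed to identify an existing file):
>>> from b2sdk.v1 import MetadataDirectiveMode
>>> bucket.copy_file(
        file_id,
        'copy-with-new-metadata.txt',
        metadata_directive=MetadataDirectiveMode.REPLACE,
        content_type='text/plain',
        file_info={'copied': 'yes'},
    )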
-
Data classes¶
-
class
b2sdk.v1.
FileVersionInfo
(id_, file_name, size, content_type, content_sha1, file_info, upload_timestamp, action, content_md5=None)[source]¶ A structure which represents a version of a file (in B2 cloud).
- Variables
id_ (str) –
fileId
file_name (str) – full file name (with path)
size (int or None) – size in bytes, can be None (unknown)
content_type (str) – RFC 822 content type, for example
"application/octet-stream"
content_sha1 (str or None) – sha1 checksum of the entire file, can be
None
(unknown) if it is a large file uploaded by a client which did not provide itcontent_md5 (str or None) – md5 checksum of the file, can be
None
(unknown)file_info (dict) – file info dict
upload_timestamp (int or None) – in milliseconds since epoch. Can be
None
(unknown).action (str) –
"upload"
,"hide"
or"delete"
-
LS_ENTRY_TEMPLATE
= '%83s %6s %10s %8s %9d %s'¶
-
as_dict
()[source]¶ Represent the object as a dict which looks almost exactly like the raw API output for upload/list.
-
format_ls_entry
()[source]¶ Legacy method, to be removed in v2: format an ls entry for the b2 command-line tool.
-
classmethod
format_folder_ls_entry
(name)[source]¶ Legacy method, to be removed in v2: format an ls “folder” entry consistently with format_ls_entry().
-
-
class
b2sdk.v1.
FileIdAndName
(file_id, file_name)[source]¶ A structure which represents a B2 cloud file with just file_name and fileId attributes.
Used to return data from calls to
b2sdk.v1.Bucket.delete_file_version()
.-
as_dict
()[source]¶ Represent the object as a dict which looks almost exactly like the raw API output for delete_file_version.
-
-
-
class
b2sdk.v1.
UnfinishedLargeFile
[source]¶ A structure which represents an unfinished large file (in B2 cloud).
Enums¶
-
class
b2sdk.v1.
MetadataDirectiveMode
(value)[source]¶ Mode of handling metadata when copying a file
-
COPY
= 401¶ copy metadata from the source file
-
REPLACE
= 402¶ ignore the source file metadata and set it to provided values
-
-
class
b2sdk.v1.
NewerFileSyncMode
(value)[source]¶ Mode of handling files newer on destination than on source
-
SKIP
= 101¶ skip syncing such file
-
REPLACE
= 102¶ replace the file on the destination with the (older) file on source
-
RAISE_ERROR
= 103¶ raise a non-transient error, failing the sync operation
-
-
class
b2sdk.v1.
CompareVersionMode
(value)[source]¶ Mode of comparing versions of files to determine what should be synced and what shouldn’t
-
MODTIME
= 201¶ use file modification time on source filesystem
-
SIZE
= 202¶ compare using file size
-
NONE
= 203¶ compare using file name only
-
-
class
b2sdk.v1.
KeepOrDeleteMode
(value)[source]¶ Mode of dealing with old versions of files on the destination
-
DELETE
= 301¶ delete the old version as soon as the new one has been uploaded
-
KEEP_BEFORE_DELETE
= 302¶ keep the old versions of the file for a configurable number of days before deleting them, always keeping the newest version
-
NO_DELETE
= 303¶ keep old versions of the file, do not delete anything
-
Progress reporters¶
Note
Concrete classes described in this chapter implement methods defined in AbstractProgressListener
-
class
b2sdk.v1.
AbstractProgressListener
[source]¶ Interface expected by B2Api upload and download methods to report on progress.
This interface just accepts the number of bytes transferred so far. Subclasses will need to know the total size if they want to report a percent done.
-
abstract
set_total_bytes
(total_byte_count)[source]¶ Always called before __enter__ to set the expected total number of bytes.
May be called more than once if an upload is retried.
- Parameters
total_byte_count (int) – expected total number of bytes
-
abstract
bytes_completed
(byte_count)[source]¶ Report the given number of bytes that have been transferred so far. This is not a delta, it is the total number of bytes transferred so far.
A transfer can fail and restart from the beginning, so the byte count can decrease between calls.
- Parameters
byte_count (int) – number of bytes that have been transferred so far
-
abstract
close
()[source]¶ Must be called when you’re done with the listener. In well-structured code, should be called only once.
-
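A minimal sketch of a custom listener, assuming the three abstract methods above are all that must be provided; a robust listener should also tolerate byte counts that decrease when a transfer restarts:
from b2sdk.v1 import AbstractProgressListener

class PercentProgressListener(AbstractProgressListener):
    def __init__(self):
        super().__init__()
        self.total = None

    def set_total_bytes(self, total_byte_count):
        # always called before the first bytes_completed() call
        self.total = total_byte_count

    def bytes_completed(self, byte_count):
        if self.total:
            print('%.1f%%' % (100.0 * byte_count / self.total))

    def close(self):
        pass
An instance can then be passed as the progress_listener argument of the upload and download methods described above.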
class
b2sdk.v1.
TqdmProgressListener
(description, *args, **kwargs)[source]¶ Progress listener based on tqdm library.
-
class
b2sdk.v1.
SimpleProgressListener
(description, *args, **kwargs)[source]¶ Just a simple progress listener which prints info on a console.
Synchronizer¶
Synchronizer is a powerful utility with functionality of a basic backup application. It is able to copy entire folders into the cloud and back to a local drive, providing retention policies and many other options.
The high performance of sync is credited to parallelization of:
listing local directory contents
listing bucket contents
uploads
downloads
Synchronizer spawns threads to perform the operations listed above in parallel to shorten the backup window to a minimum.
Sync Options¶
The following are the important optional arguments that can be provided when initializing the Synchronizer class.

compare_version_mode: When comparing source and destination files to decide whether to replace them, compare_version_mode specifies the mode of comparison. For possible values see b2sdk.v1.CompareVersionMode. Default value is b2sdk.v1.CompareVersionMode.MODTIME.
compare_threshold: The minimum difference in size (in bytes) or modification time (in seconds) between the source and destination files before the source is considered new and replaces the destination.
newer_file_mode: Determines whether to skip or replace the file if the source is older. For possible values see b2sdk.v1.NewerFileSyncMode. If you don’t specify this, the sync will raise b2sdk.v1.exception.DestFileNewer if any source file is older than its destination counterpart.
keep_days_or_delete: Specifies the policy for keeping or deleting older file versions. For possible values see b2sdk.v1.KeepOrDeleteMode. Default is b2sdk.v1.KeepOrDeleteMode.NO_DELETE.
keep_days: If keep_days_or_delete is b2sdk.v1.KeepOrDeleteMode.KEEP_BEFORE_DELETE, this specifies for how many days older versions should be kept.
>>> from b2sdk.v1 import ScanPoliciesManager
>>> from b2sdk.v1 import parse_sync_folder
>>> from b2sdk.v1 import Synchronizer
>>> from b2sdk.v1 import KeepOrDeleteMode, CompareVersionMode, NewerFileSyncMode
>>> import time
>>> import sys
>>> source = '/home/user1/b2_example'
>>> destination = 'b2://example-mybucket-b2'
>>> source = parse_sync_folder(source, b2_api)
>>> destination = parse_sync_folder(destination, b2_api)
>>> policies_manager = ScanPoliciesManager(exclude_all_symlinks=True)
>>> synchronizer = Synchronizer(
max_workers=10,
policies_manager=policies_manager,
dry_run=False,
allow_empty_source=True,
compare_version_mode=CompareVersionMode.SIZE,
compare_threshold=10,
newer_file_mode=NewerFileSyncMode.REPLACE,
keep_days_or_delete=KeepOrDeleteMode.KEEP_BEFORE_DELETE,
keep_days=10,
)
We have a file (hello.txt) which is present in the destination but not in the source (my local folder), so it will be deleted; since our mode is to keep deleted files, it will be hidden in the bucket for 10 days.
>>> no_progress = False
>>> with SyncReport(sys.stdout, no_progress) as reporter:
synchronizer.sync_folders(
source_folder=source,
dest_folder=destination,
now_millis=int(round(time.time() * 1000)),
reporter=reporter,
)
upload f1.txt
delete hello.txt (old version)
hide hello.txt
We changed f1.txt and added 1 byte. Since our compare_threshold is 10, sync will not do anything.
>>> with SyncReport(sys.stdout, no_progress) as reporter:
synchronizer.sync_folders(
source_folder=source,
dest_folder=destination,
now_millis=int(round(time.time() * 1000)),
reporter=reporter,
)
We changed f1.txt and added more than 10 bytes. Since our compare_threshold is 10, the file at the destination will be replaced.
>>> with SyncReport(sys.stdout, no_progress) as reporter:
synchronizer.sync_folders(
source_folder=source,
dest_folder=destination,
now_millis=int(round(time.time() * 1000)),
reporter=reporter,
)
upload f1.txt
Let’s just delete the file instead of keeping it: keep_days_or_delete = KeepOrDeleteMode.DELETE. You can omit the keep_days argument in this case, because it would be ignored anyway.
>>> synchronizer = Synchronizer(
max_workers=10,
policies_manager=policies_manager,
dry_run=False,
allow_empty_source=True,
compare_version_mode=CompareVersionMode.SIZE,
compare_threshold=10, # in bytes
newer_file_mode=NewerFileSyncMode.REPLACE,
keep_days_or_delete=KeepOrDeleteMode.DELETE,
)
>>> with SyncReport(sys.stdout, no_progress) as reporter:
synchronizer.sync_folders(
source_folder=source,
dest_folder=destination,
now_millis=int(round(time.time() * 1000)),
reporter=reporter,
)
delete f1.txt
delete f1.txt (old version)
delete hello.txt (old version)
upload f2.txt
delete hello.txt (hide marker)
As you can see, it deleted f1.txt and its older versions (no hide this time), and it also deleted hello.txt, because we no longer want the file. We also added another file, f2.txt, which gets uploaded.
Now we changed newer_file_mode to SKIP and compare_version_mode to MODTIME. We also uploaded a new version of f2.txt to the bucket using the B2 web UI.
>>> synchronizer = Synchronizer(
max_workers=10,
policies_manager=policies_manager,
dry_run=False,
allow_empty_source=True,
compare_version_mode=CompareVersionMode.MODTIME,
compare_threshold=10, # in seconds
newer_file_mode=NewerFileSyncMode.SKIP,
keep_days_or_delete=KeepOrDeleteMode.DELETE,
)
>>> with SyncReport(sys.stdout, no_progress) as reporter:
synchronizer.sync_folders(
source_folder=source,
dest_folder=destination,
now_millis=int(round(time.time() * 1000)),
reporter=reporter,
)
As expected, nothing happened: sync found a file that was older at the source, but did nothing because newer_file_mode is SKIP.
Now we changed newer_file_mode back to REPLACE and again uploaded a new version of f2.txt to the bucket using the B2 web UI.
>>> synchronizer = Synchronizer(
max_workers=10,
policies_manager=policies_manager,
dry_run=False,
allow_empty_source=True,
compare_version_mode=CompareVersionMode.MODTIME,
compare_threshold=10,
newer_file_mode=NewerFileSyncMode.REPLACE,
keep_days_or_delete=KeepOrDeleteMode.DELETE,
)
>>> with SyncReport(sys.stdout, no_progress) as reporter:
synchronizer.sync_folders(
source_folder=source,
dest_folder=destination,
now_millis=int(round(time.time() * 1000)),
reporter=reporter,
)
delete f2.txt (old version)
upload f2.txt
-
class
b2sdk.v1.
ScanPoliciesManager
[source]¶ Policy object used when scanning folders for syncing, used to decide which files to include in the list of files to be synced.
Code that scans through files should at least use should_exclude_file() to decide whether each file should be included; it will check include/exclude patterns for file names, as well as patterns for excluding directories.
Code that scans may optionally use should_exclude_directory() to test whether it can skip a directory completely and not bother listing the files and sub-directories in it.
-
__init__
(exclude_dir_regexes=(), exclude_file_regexes=(), include_file_regexes=(), exclude_all_symlinks=False, exclude_modified_before=None, exclude_modified_after=None)[source]¶ - Parameters
exclude_dir_regexes (tuple) – a tuple of regexes to exclude directories
exclude_file_regexes (tuple) – a tuple of regexes to exclude files
include_file_regexes (tuple) – a tuple of regexes to include files
exclude_all_symlinks (bool) – if True, exclude all symlinks
exclude_modified_before (int, optional) – optionally exclude file versions modified before (in millis)
exclude_modified_after (int, optional) – optionally exclude file versions modified after (in millis)
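For example (the patterns are illustrative):
>>> policies_manager = ScanPoliciesManager(
        exclude_dir_regexes=('.*cache.*',),
        exclude_file_regexes=(r'.*\.tmp$',),
        exclude_all_symlinks=True,
    )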
-
should_exclude_file
(file_path)[source]¶ Given the full path of a file, decide if it should be excluded from the scan.
-
should_exclude_file_version
(file_version)[source]¶ Given the modification time of a file version, decide if it should be excluded from the scan.
- Parameters
file_version – the file version object
- Type
b2sdk.v1.FileVersion
- Returns
True if excluded.
- Return type
bool
-
should_exclude_directory
(dir_path)[source]¶ Given the full path of a directory, decide if all of the files in it should be excluded from the scan.
- Parameters
dir_path (str) – the path of the directory, relative to the root directory being scanned. The path will never end in ‘/’.
- Returns
True if excluded.
-
-
class
b2sdk.v1.
Synchronizer
[source]¶ -
__init__
(max_workers, policies_manager=<b2sdk.sync.scan_policies.ScanPoliciesManager object>, dry_run=False, allow_empty_source=False, newer_file_mode=<NewerFileSyncMode.RAISE_ERROR: 103>, keep_days_or_delete=<KeepOrDeleteMode.NO_DELETE: 303>, compare_version_mode=<CompareVersionMode.MODTIME: 201>, compare_threshold=None, keep_days=None)[source]¶ Initialize synchronizer class and validate arguments
- Parameters
max_workers (int) – max number of workers
policies_manager – policies manager object
dry_run (bool) – test mode, does not actually transfer/delete when enabled
allow_empty_source (bool) – if True, do not check whether source folder is empty
newer_file_mode (b2sdk.v1.NewerFileSyncMode) – setting which determines handling for destination files newer than on the source
keep_days_or_delete (b2sdk.v1.KeepOrDeleteMode) – setting which determines if we should delete or not delete or keep for keep_days
compare_version_mode (b2sdk.v1.CompareVersionMode) – how to compare the source and destination files to find new ones
compare_threshold (int) – should be greater than 0, default is 0
keep_days (int) – if keep_days_or_delete is b2sdk.v1.KeepOrDeleteMode.KEEP_BEFORE_DELETE, then this should be greater than 0
-
sync_folders
(source_folder, dest_folder, now_millis, reporter)[source]¶ Syncs two folders. Always ensures that every file in the source is also in the destination. Deletes any file versions in the destination that are older than allowed by the retention settings (keep_days).
- Parameters
source_folder (b2sdk.sync.folder.AbstractFolder) – source folder object
dest_folder (b2sdk.sync.folder.AbstractFolder) – destination folder object
now_millis (int) – current time in milliseconds
reporter (b2sdk.sync.report.SyncReport,None) – progress reporter
-
make_folder_sync_actions
(source_folder, dest_folder, now_millis, reporter, policies_manager=<b2sdk.sync.scan_policies.ScanPoliciesManager object>)[source]¶ Yield a sequence of actions that will sync the destination folder to the source folder.
- Parameters
source_folder (b2sdk.v1.AbstractFolder) – source folder object
dest_folder (b2sdk.v1.AbstractFolder) – destination folder object
now_millis (int) – current time in milliseconds
reporter (b2sdk.v1.SyncReport) – reporter object
policies_manager – policies manager object
-
make_file_sync_actions
(sync_type, source_file, dest_file, source_folder, dest_folder, now_millis)[source]¶ Yields the sequence of actions needed to sync the two files
- Parameters
sync_type (str) – synchronization type
source_file (b2sdk.v1.File) – source file object
dest_file (b2sdk.v1.File) – destination file object
source_folder (b2sdk.v1.AbstractFolder) – a source folder object
dest_folder (b2sdk.v1.AbstractFolder) – a destination folder object
now_millis (int) – current time in milliseconds
-
-
class
b2sdk.v1.
SyncReport
[source]¶ Handle reporting progress for syncing.
Print out each file as it is processed, and put up a sequence of progress bars.
- The progress bars are:
Step 1/3: count local files
Step 2/3: compare file lists
Step 3/3: transfer files
This class is THREAD SAFE, so it can be used from parallel sync threads.
-
UPDATE_INTERVAL
= 0.1¶
-
__init__
(stdout, no_progress)[source]¶ - Parameters
stdout – standard output file object
no_progress (bool) – if True, do not show progress
-
error
(message)[source]¶ Print an error, gracefully interleaving it with a progress bar.
- Parameters
message (str) – an error message
-
print_completion
(message)[source]¶ Remove the progress bar, prints a message, and puts the progress bar back.
- Parameters
message (str) – a completion message
-
update_local
(delta)[source]¶ Report that more local files have been found.
- Parameters
delta (int) – number of files found since the last check
-
update_compare
(delta)[source]¶ Report that more files have been compared.
- Parameters
delta (int) – number of files compared
-
end_compare
(total_transfer_files, total_transfer_bytes)[source]¶ Report that the comparison has been finished.
-
local_access_error
(path)[source]¶ Add a file access error message to the list of warnings.
- Parameters
path (str) – file path
B2 Utility functions¶
-
b2sdk.v1.
choose_part_ranges
(content_length, minimum_part_size)[source]¶ Return a list of (offset, length) for the parts of a large file.
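A quick sketch of the result shape; the exact split is implementation-defined, so the output shown here is illustrative only:
>>> from b2sdk.v1 import choose_part_ranges
>>> choose_part_ranges(300, 100)  # three (offset, length) pairs covering the file
[(0, 100), (100, 100), (200, 100)]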
-
b2sdk.v1.
fix_windows_path_limit
(path)[source]¶ Prefix paths when running on Windows to overcome 260 character path length limit. See https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath
-
b2sdk.v1.
format_and_scale_fraction
(numerator, denominator, unit)[source]¶ Pick a good scale for representing a fraction, and format it.
-
b2sdk.v1.
format_and_scale_number
(x, unit)[source]¶ Pick a good scale for representing a number and format it.
-
b2sdk.v1.
hex_sha1_of_stream
(input_stream, content_length)[source]¶ Return the 40-character hex SHA1 checksum of the first content_length bytes in the input stream.
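For example:
>>> import io
>>> from b2sdk.v1 import hex_sha1_of_stream
>>> hex_sha1_of_stream(io.BytesIO(b'hello world'), 11)
'2aae6c35c94fcfb415dbe95f408b9ce91ee846ed'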
Write intent¶
-
class
b2sdk.v1.
WriteIntent
[source]¶ Wrapper for outbound source that defines destination offset.
-
__init__
(outbound_source, destination_offset=0)[source]¶ - Parameters
outbound_source (b2sdk.v1.OutboundTransferSource) – data source (remote or local)
destination_offset (int) – point of start in destination file
-
classmethod
wrap_sources_iterator
(outbound_sources_iterator)[source]¶ Helper that wraps outbound sources iterator with write intents.
Can be used in cases similar to concatenate to automatically compute destination offsets
- Param
iterator[b2sdk.v1.OutboundTransferSource] outbound_sources_iterator: iterator of outbound sources
- Return type
generator[b2sdk.v1.WriteIntent]
-
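A sketch that combines two local sources into one remote file via create_file. The paths and offsets are illustrative; the second source is assumed to start exactly where the first one ends:
>>> from b2sdk.v1 import UploadSourceLocalFile, WriteIntent
>>> intents = [
        WriteIntent(UploadSourceLocalFile('/tmp/head.bin'), destination_offset=0),
        WriteIntent(UploadSourceLocalFile('/tmp/tail.bin'), destination_offset=1024),
    ]
>>> bucket.create_file(intents, 'combined.bin')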
Outbound Transfer Source¶
-
class
b2sdk.v1.
OutboundTransferSource
[source]¶ Abstract class for defining outbound transfer sources.
Supported outbound transfer sources are:
b2sdk.v1.CopySource
b2sdk.v1.UploadSourceBytes
b2sdk.v1.UploadSourceLocalFile
b2sdk.v1.UploadSourceLocalFileRange
b2sdk.v1.UploadSourceStream
b2sdk.v1.UploadSourceStreamRange
Download destination¶
Note
Concrete classes described in this chapter implement methods defined in AbstractDownloadDestination
-
class
b2sdk.v1.
AbstractDownloadDestination
[source]¶ Interface to a destination for a downloaded file.
-
abstract
make_file_context
(file_id, file_name, content_length, content_type, content_sha1, file_info, mod_time_millis, range_=None)[source]¶ Return a context manager that yields a binary file-like object to use for writing the contents of the file.
- Parameters
file_id (str) – the B2 file ID from the headers
file_name (str) – the B2 file name from the headers
content_type (str) – the content type from the headers
content_sha1 (str) – the content sha1 from the headers (or
"none"
for large files)file_info (dict) – the user file info from the headers
mod_time_millis (int) – the desired file modification date in ms since 1970-01-01
range (None,tuple[int,int]) – starting and ending offsets of the received file contents. Usually
None
, which means that the whole file is downloaded.
- Returns
None
-
-
class
b2sdk.v1.
DownloadDestLocalFile
[source]¶ Store a downloaded file into a local file and set its modification time.
-
class
b2sdk.v1.
DownloadDestProgressWrapper
[source]¶ Wrap a DownloadDestination and report progress to a ProgressListener.
Internal API¶
Note
See Internal interface chapter to learn when and how to safely use the Internal API
b2sdk.session
– B2 Session¶
-
class
b2sdk.session.
TokenType
(value)[source]¶ Bases:
enum.Enum
An enumeration.
-
API
= 'api'¶
-
API_TOKEN_ONLY
= 'api_token_only'¶
-
UPLOAD_PART
= 'upload_part'¶
-
UPLOAD_SMALL
= 'upload_small'¶
-
-
class
b2sdk.session.
B2Session
(account_info=None, cache=None, raw_api=None)[source]¶ Bases:
object
A facade that supplies the correct api_url and account_auth_token to methods of underlying raw_api and reauthorizes if necessary.
-
__init__
(account_info=None, cache=None, raw_api=None)[source]¶ Initialize Session using given account info.
- Parameters
account_info – an instance of SqliteAccountInfo, UrlPoolAccountInfo, or any custom class derived from AbstractAccountInfo. To learn more about Account Info objects, see the AccountInfo section.
cache – an instance of one of the following classes: DummyCache, InMemoryCache, AuthInfoCache, or any custom class derived from AbstractCache. It is used by B2Api to cache the mapping between bucket names and bucket ids. Default is DummyCache.
raw_api – an instance of one of the following classes: B2RawApi, RawSimulator, or any custom class derived from AbstractRawApi. RawSimulator makes network-less unit testing simple: use RawSimulator in tests and B2RawApi in production. Default is B2RawApi.
-
authorize_automatically
()[source]¶ Perform automatic account authorization, retrieving all account data from the account info object passed during initialization.
-
authorize_account
(realm, application_key_id, application_key)[source]¶ Perform account authorization.
- Parameters
realm (str) – a realm to authorize account in (usually just “production”)
application_key_id (str) – application key ID
application_key (str) – user’s application key
-
create_bucket
(account_id, bucket_name, bucket_type, bucket_info=None, cors_rules=None, lifecycle_rules=None)[source]¶
-
create_key
(account_id, capabilities, key_name, valid_duration_seconds, bucket_id, name_prefix)[source]¶
-
list_file_versions
(bucket_id, start_file_name=None, start_file_id=None, max_file_count=None, prefix=None)[source]¶
-
list_unfinished_large_files
(bucket_id, start_file_id=None, max_file_count=None, prefix=None)[source]¶
-
update_bucket
(account_id, bucket_id, bucket_type=None, bucket_info=None, cors_rules=None, lifecycle_rules=None, if_revision_is=None)[source]¶
-
upload_file
(bucket_id, file_name, content_length, content_type, content_sha1, file_infos, data_stream)[source]¶
-
b2sdk.raw_api
– B2 raw api wrapper¶
-
class
b2sdk.raw_api.
MetadataDirectiveMode
(value)[source]¶ Bases:
enum.Enum
Mode of handling metadata when copying a file
-
COPY
= 401¶ copy metadata from the source file
-
REPLACE
= 402¶ ignore the source file metadata and set it to provided values
-
-
class
b2sdk.raw_api.
AbstractRawApi
[source]¶ Bases:
object
Direct access to the B2 web apis.
-
abstract
copy_file
(api_url, account_auth_token, source_file_id, new_file_name, bytes_range=None, metadata_directive=None, content_type=None, file_info=None, destination_bucket_id=None)[source]¶
-
abstract
copy_part
(api_url, account_auth_token, source_file_id, large_file_id, part_number, bytes_range=None)[source]¶
-
abstract
create_bucket
(api_url, account_auth_token, account_id, bucket_name, bucket_type, bucket_info=None, cors_rules=None, lifecycle_rules=None)[source]¶
-
abstract
create_key
(api_url, account_auth_token, account_id, capabilities, key_name, valid_duration_seconds, bucket_id, name_prefix)[source]¶
-
abstract
list_buckets
(api_url, account_auth_token, account_id, bucket_id=None, bucket_name=None)[source]¶
-
abstract
list_file_names
(api_url, account_auth_token, bucket_id, start_file_name=None, max_file_count=None, prefix=None)[source]¶
-
abstract
list_file_versions
(api_url, account_auth_token, bucket_id, start_file_name=None, start_file_id=None, max_file_count=None, prefix=None)[source]¶
-
abstract
list_keys
(api_url, account_auth_token, account_id, max_key_count=None, start_application_key_id=None)[source]¶
-
abstract
list_parts
(api_url, account_auth_token, file_id, start_part_number, max_part_count)[source]¶
-
abstract
list_unfinished_large_files
(api_url, account_auth_token, bucket_id, start_file_id=None, max_file_count=None, prefix=None)[source]¶
-
abstract
start_large_file
(api_url, account_auth_token, bucket_id, file_name, content_type, file_info)[source]¶
-
abstract
update_bucket
(api_url, account_auth_token, account_id, bucket_id, bucket_type=None, bucket_info=None, cors_rules=None, lifecycle_rules=None, if_revision_is=None)[source]¶
-
abstract
upload_file
(upload_url, upload_auth_token, file_name, content_length, content_type, content_sha1, file_infos, data_stream)[source]¶
-
-
class
b2sdk.raw_api.
B2RawApi
(b2_http)[source]¶ Bases:
b2sdk.raw_api.AbstractRawApi
Provide access to the B2 web APIs, exactly as they are provided by b2.
Requires that you provide all necessary URLs and auth tokens for each call.
Each API call decodes the returned JSON and returns a dict.
- For details on what each method does, see the B2 docs: https://www.backblaze.com/b2/docs/
This class is intended to be a super-simple, very thin layer on top of the HTTP calls. It can be mocked-out for testing higher layers. And this class can be tested by exercising each call just once, which is relatively quick.
-
create_bucket
(api_url, account_auth_token, account_id, bucket_name, bucket_type, bucket_info=None, cors_rules=None, lifecycle_rules=None)[source]¶
-
create_key
(api_url, account_auth_token, account_id, capabilities, key_name, valid_duration_seconds, bucket_id, name_prefix)[source]¶
-
download_file_from_url
(account_auth_token_or_none, url, range_=None)[source]¶ Issue a streaming request for download of a file, potentially authorized.
- Parameters
account_auth_token_or_none – an optional account auth token to pass in
url – the full URL to download from
range – two-element tuple for http Range header
- Returns
b2_http response
-
list_file_names
(api_url, account_auth_token, bucket_id, start_file_name=None, max_file_count=None, prefix=None)[source]¶
-
list_file_versions
(api_url, account_auth_token, bucket_id, start_file_name=None, start_file_id=None, max_file_count=None, prefix=None)[source]¶
-
list_keys
(api_url, account_auth_token, account_id, max_key_count=None, start_application_key_id=None)[source]¶
-
list_unfinished_large_files
(api_url, account_auth_token, bucket_id, start_file_id=None, max_file_count=None, prefix=None)[source]¶
-
start_large_file
(api_url, account_auth_token, bucket_id, file_name, content_type, file_info)[source]¶
-
update_bucket
(api_url, account_auth_token, account_id, bucket_id, bucket_type=None, bucket_info=None, cors_rules=None, lifecycle_rules=None, if_revision_is=None)[source]¶
-
unprintable_to_hex
(string)[source]¶ Replace unprintable chars in string with a hex representation.
- Parameters
string – an arbitrary string, possibly with unprintable characters.
- Returns
the string, with unprintable characters replaced by their hex representation
-
check_b2_filename
(filename)[source]¶ Raise an appropriate exception with details if the filename is unusable.
See https://www.backblaze.com/b2/docs/files.html for the rules.
- Parameters
filename – a proposed filename in unicode
- Returns
None if the filename is usable
-
upload_file
(upload_url, upload_auth_token, file_name, content_length, content_type, content_sha1, file_infos, data_stream)[source]¶ Upload one small file to B2.
- Parameters
upload_url – the upload_url from b2_authorize_account
upload_auth_token – the auth token from b2_authorize_account
file_name – the name of the B2 file
content_length – number of bytes in the file
content_type – MIME type
content_sha1 – hex SHA1 of the contents of the file
file_infos – extra file info to upload
data_stream – a file like object from which the contents of the file can be read
- Returns
-
upload_part
(upload_url, upload_auth_token, part_number, content_length, content_sha1, data_stream)[source]¶
-
b2sdk.raw_api.
test_raw_api
()[source]¶ Exercise the code in B2RawApi by making each call once, just to make sure the parameters are passed in, and the result is passed back.
The goal is to be a complete test of B2RawApi, so the tests for the rest of the code can use the simulator.
Prints to stdout if things go wrong.
- Returns
0 on success, non-zero on failure
-
b2sdk.raw_api.
test_raw_api_helper
(raw_api)[source]¶ Try each of the calls to the raw api. Raise an exception if anything goes wrong.
This uses a Backblaze account that is just for this test. The account uses the free level of service, which should be enough to run this test a reasonable number of times each day. If somebody abuses the account for other things, this test will break and we’ll have to do something about it.
b2sdk.b2http
– thin http client wrapper¶
-
class
b2sdk.b2http.
ResponseContextManager
(response)[source]¶ A context manager that closes a requests.Response when done.
-
class
b2sdk.b2http.
HttpCallback
[source]¶ A callback object that does nothing. Override pre_request and/or post_request as desired.
-
pre_request
(method, url, headers)[source]¶ Called before processing an HTTP request.
Raises an exception if this request should not be processed. The exception raised must inherit from B2HttpCallbackPreRequestException.
-
-
class
b2sdk.b2http.
ClockSkewHook
[source]¶
-
class
b2sdk.b2http.
B2Http
(requests_module=None, install_clock_skew_hook=True)[source]¶ A wrapper for the requests module. Provides the operations needed to access B2, and handles retrying when the returned status is 503 Service Unavailable, 429 Too Many Requests, etc.
The operations supported are:
post_json_return_json
post_content_return_json
get_content
The methods that return JSON either return a Python dict or raise a subclass of B2Error. They can be used like this:
try:
    response_dict = b2_http.post_json_return_json(url, headers, params)
    ...
except B2Error as e:
    ...
-
TIMEOUT
= 130¶
-
add_callback
(callback)[source]¶ Add a callback that inherits from HttpCallback.
- Parameters
callback (callable) – a callback to be added to a chain
-
post_content_return_json
(url, headers, data, try_count=5, post_params=None)[source]¶ Use like this:
try:
    response_dict = b2_http.post_content_return_json(url, headers, data)
    ...
except B2Error as e:
    ...
-
post_json_return_json
(url, headers, params, try_count=5)[source]¶ Use like this:
try:
    response_dict = b2_http.post_json_return_json(url, headers, params)
    ...
except B2Error as e:
    ...
-
get_content
(url, headers, try_count=5)[source]¶ Fetch content from a URL.
Use like this:
try:
    with b2_http.get_content(url, headers) as response:
        for byte_data in response.iter_content(chunk_size=1024):
            ...
except B2Error as e:
    ...
- The response object is only guaranteed to have:
headers
iter_content()
b2sdk.utils
¶
-
b2sdk.utils.
interruptible_get_result
(future)[source]¶ Wait for the result of a future in a way that can be interrupted by a KeyboardInterrupt.
This is not necessary in Python 3, but is needed for Python 2.
- Parameters
future (Future) – a future to get result of
-
b2sdk.utils.
b2_url_encode
(s)[source]¶ URL-encode a unicode string to be sent to B2 in an HTTP header.
-
b2sdk.utils.
choose_part_ranges
(content_length, minimum_part_size)[source]¶ Return a list of (offset, length) for the parts of a large file.
-
b2sdk.utils.
hex_sha1_of_stream
(input_stream, content_length)[source]¶ Return the 40-character hex SHA1 checksum of the first content_length bytes in the input stream.
-
b2sdk.utils.
validate_b2_file_name
(name)[source]¶ Raise a ValueError if the name is not a valid B2 file name.
- Parameters
name (str) – a string to check
-
b2sdk.utils.
is_file_readable
(local_path, reporter=None)[source]¶ Check if the local file has read permissions.
-
b2sdk.utils.
get_file_mtime
(local_path, rounded=True)[source]¶ Get modification time of a file in milliseconds.
-
b2sdk.utils.
set_file_mtime
(local_path, mod_time_millis, rounded=True)[source]¶ Set modification time of a file in milliseconds.
-
b2sdk.utils.
fix_windows_path_limit
(path)[source]¶ Prefix paths when running on Windows to overcome 260 character path length limit. See https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx#maxpath
-
class
b2sdk.utils.
TempDir
[source]¶ Bases:
object
Context manager that creates and destroys a temporary directory.
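A sketch, assuming the context manager yields the path of the created directory:
>>> from b2sdk.utils import TempDir
>>> with TempDir() as temp_dir:
        print(temp_dir)  # a fresh directory path; removed when the block exits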
-
b2sdk.utils.
format_and_scale_number
(x, unit)[source]¶ Pick a good scale for representing a number and format it.
-
b2sdk.utils.
format_and_scale_fraction
(numerator, denominator, unit)[source]¶ Pick a good scale for representing a fraction, and format it.
-
b2sdk.utils.
camelcase_to_underscore
(input_)[source]¶ Convert a camel-cased string to a string with underscores.
-
class
b2sdk.utils.
B2TraceMeta
(name, bases, attrs, **kwargs)[source]¶ Bases:
logfury.v0_1.meta.DefaultTraceMeta
Trace all public method calls, except for ones with names that begin with get_.
-
class
b2sdk.utils.
B2TraceMetaAbstract
(name, bases, namespace, **kwargs)[source]¶ Bases:
logfury.v0_1.meta.DefaultTraceAbstractMeta
Default class for tracers, to be set as a metaclass for abstract base classes.
-
class
b2sdk.utils.
ConcurrentUsedAuthTokenGuard
(lock, token)[source]¶ Bases:
object
Context manager preventing two tokens being used simultaneously. Throws UploadTokenUsedConcurrently when unable to acquire a lock. Sample usage:
with ConcurrentUsedAuthTokenGuard(lock_for_token, token):
    # code that uses the token exclusively
b2sdk.cache
¶
-
class
b2sdk.cache.
DummyCache
[source]¶ Bases:
b2sdk.cache.AbstractCache
A cache that does nothing.
-
class
b2sdk.cache.
InMemoryCache
[source]¶ Bases:
b2sdk.cache.AbstractCache
A cache that stores the information in memory.
-
class
b2sdk.cache.
AuthInfoCache
(info)[source]¶ Bases:
b2sdk.cache.AbstractCache
A cache that stores data persistently in StoredAccountInfo.
b2sdk.download_dest
– Download destination¶
-
class
b2sdk.download_dest.
AbstractDownloadDestination
[source]¶ Bases:
object
Interface to a destination for a downloaded file.
-
abstract
make_file_context
(file_id, file_name, content_length, content_type, content_sha1, file_info, mod_time_millis, range_=None)[source]¶ Return a context manager that yields a binary file-like object to use for writing the contents of the file.
- Parameters
file_id (str) – the B2 file ID from the headers
file_name (str) – the B2 file name from the headers
content_type (str) – the content type from the headers
content_sha1 (str) – the content sha1 from the headers (or
"none"
for large files)file_info (dict) – the user file info from the headers
mod_time_millis (int) – the desired file modification date in ms since 1970-01-01
range (None,tuple[int,int]) – starting and ending offsets of the received file contents. Usually
None
, which means that the whole file is downloaded.
- Returns
None
-
-
class
b2sdk.download_dest.
DownloadDestLocalFile
(local_file_path)[source]¶ Bases:
b2sdk.download_dest.AbstractDownloadDestination
Store a downloaded file into a local file and set its modification time.
-
MODE
= 'wb+'¶
-
make_file_context
(file_id, file_name, content_length, content_type, content_sha1, file_info, mod_time_millis, range_=None)[source]¶ Return a context manager that yields a binary file-like object to use for writing the contents of the file.
- Parameters
file_id (str) – the B2 file ID from the headers
file_name (str) – the B2 file name from the headers
content_type (str) – the content type from the headers
content_sha1 (str) – the content sha1 from the headers (or
"none"
for large files)file_info (dict) – the user file info from the headers
mod_time_millis (int) – the desired file modification date in ms since 1970-01-01
range (None,tuple[int,int]) – starting and ending offsets of the received file contents. Usually
None
, which means that the whole file is downloaded.
- Returns
None
-
-
class
b2sdk.download_dest.
PreSeekedDownloadDest
(local_file_path, seek_target)[source]¶ Bases:
b2sdk.download_dest.DownloadDestLocalFile
Store a downloaded file into a local file and set its modification time. Does not truncate the target file; seeks to a given offset just after opening a descriptor.
-
MODE
= 'rb+'¶
-
-
class
b2sdk.download_dest.
DownloadDestBytes
[source]¶ Bases:
b2sdk.download_dest.AbstractDownloadDestination
Store a downloaded file into bytes in memory.
-
make_file_context
(file_id, file_name, content_length, content_type, content_sha1, file_info, mod_time_millis, range_=None)[source]¶ Return a context manager that yields a binary file-like object to use for writing the contents of the file.
- Parameters
file_id (str) – the B2 file ID from the headers
file_name (str) – the B2 file name from the headers
content_type (str) – the content type from the headers
content_sha1 (str) – the content sha1 from the headers (or
"none"
for large files)file_info (dict) – the user file info from the headers
mod_time_millis (int) – the desired file modification date in ms since 1970-01-01
range (None,tuple[int,int]) – starting and ending offsets of the received file contents. Usually
None
, which means that the whole file is downloaded.
- Returns
None
-
-
class
b2sdk.download_dest.
DownloadDestProgressWrapper
(download_dest, progress_listener)[source]¶ Bases:
b2sdk.download_dest.AbstractDownloadDestination
Wrap a DownloadDestination and report progress to a ProgressListener.
-
__init__
(download_dest, progress_listener)[source]¶ Initialize self. See help(type(self)) for accurate signature.
-
make_file_context
(file_id, file_name, content_length, content_type, content_sha1, file_info, mod_time_millis, range_=None)[source]¶ Return a context manager that yields a binary file-like object to use for writing the contents of the file.
- Parameters
file_id (str) – the B2 file ID from the headers
file_name (str) – the B2 file name from the headers
content_type (str) – the content type from the headers
content_sha1 (str) – the content sha1 from the headers (or
"none"
for large files)file_info (dict) – the user file info from the headers
mod_time_millis (int) – the desired file modification date in ms since 1970-01-01
range (None,tuple[int,int]) – starting and ending offsets of the received file contents. Usually
None
, which means that the whole file is downloaded.
- Returns
None
-
b2sdk.stream.chained
ChainedStream¶
-
class
b2sdk.stream.chained.
ChainedStream
(stream_openers)[source]¶ Bases:
b2sdk.stream.base.ReadOnlyStreamMixin
,io.IOBase
Chains multiple streams into a single stream, sort of what itertools.chain does for iterators. Cleans up the buffers of underlying streams when closed.
Can be seeked back to the beginning (when retrying an upload, for example). Closes underlying streams as soon as they reach EOF; for underlying streams that follow the b2sdk.v1.StreamOpener cleanup interface (for example b2sdk.v1.CachedBytesStreamOpener), their buffers are cleared when the chained stream is closed.
-
__init__
(stream_openers)[source]¶ - Parameters
stream_openers (list) – list of callables that return opened streams
-
property
stream
¶ Return currently processed stream.
-
seekable
()[source]¶ Return whether object supports random access.
If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().
-
readable
()[source]¶ Return whether object was opened for reading.
If False, read() will raise OSError.
-
b2sdk.stream.hashing
StreamWithHash¶
-
class
b2sdk.stream.hashing.
StreamWithHash
(stream, stream_length=None)[source]¶ Bases:
b2sdk.stream.base.ReadOnlyStreamMixin
,b2sdk.stream.wrapper.StreamWithLengthWrapper
Wrap a file-like object; calculate SHA1 while reading and append the hash at the end.
-
seek
(pos, whence=0)[source]¶ Seek to a given position in the stream.
- Parameters
pos (int) – position in the stream
-
b2sdk.stream.progress
Streams with progress reporting¶
-
class
b2sdk.stream.progress.
AbstractStreamWithProgress
(stream, progress_listener, offset=0)[source]¶ Bases:
b2sdk.stream.wrapper.StreamWrapper
Wrap a file-like object and updates a ProgressListener as data is read / written. In the abstract class, read and write methods do not update the progress - child classes shall do it.
-
__init__
(stream, progress_listener, offset=0)[source]¶ - Parameters
stream – the stream to read from or write to
progress_listener (b2sdk.v1.AbstractProgressListener) – the listener that we tell about progress
offset (int) – the starting byte offset in the file
-
-
class
b2sdk.stream.progress.
ReadingStreamWithProgress
(*args, **kwargs)[source]¶ Bases:
b2sdk.stream.progress.AbstractStreamWithProgress
Wrap a file-like object, updates progress while reading.
-
__init__
(*args, **kwargs)[source]¶ - Parameters
stream – the stream to read from or write to
progress_listener (b2sdk.v1.AbstractProgressListener) – the listener that we tell about progress
offset (int) – the starting byte offset in the file
-
-
class
b2sdk.stream.progress.
WritingStreamWithProgress
(stream, progress_listener, offset=0)[source]¶ Bases:
b2sdk.stream.progress.AbstractStreamWithProgress
Wrap a file-like object; updates progress while writing.
b2sdk.stream.range
RangeOfInputStream¶
-
class
b2sdk.stream.range.
RangeOfInputStream
(stream, offset, length)[source]¶ Bases:
b2sdk.stream.base.ReadOnlyStreamMixin
,b2sdk.stream.wrapper.StreamWithLengthWrapper
Wrap a file-like object (read only) and read the selected range of the file.
b2sdk.stream.wrapper – StreamWrapper¶

class b2sdk.stream.wrapper.StreamWrapper(stream)
    Bases: io.IOBase

    Wrapper for a file-like object.

    seekable()
        Return whether the object supports random access.
        If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().

    truncate(size=None)
        Truncate the file to size bytes.
        The file pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.

    readable()
        Return whether the object was opened for reading.
        If False, read() will raise OSError.

    read(size=None)
        Read data from the stream.
        Parameters
            size (int) – number of bytes to read
        Returns
            data read from the stream

class b2sdk.stream.wrapper.StreamWithLengthWrapper(stream, length=None)
    Bases: b2sdk.stream.wrapper.StreamWrapper

    Wrapper for a file-like object that supports the __len__ interface.
b2sdk.sync.action¶

class b2sdk.sync.action.AbstractAction
    Bases: object

    An action to take, such as uploading, downloading, or deleting a file. Multi-threaded tasks create a sequence of Actions which are then run by a pool of threads.

    An action can depend on other actions completing. An example of this is making sure a CreateBucketAction happens before an UploadFileAction.

    run(bucket, reporter, dry_run=False)
        Main action routine.
        Parameters
            bucket (b2sdk.bucket.Bucket) – a Bucket object
            reporter – a place to report errors
            dry_run (bool) – if True, perform a dry run

class b2sdk.sync.action.B2UploadAction(local_full_path, relative_name, b2_file_name, mod_time_millis, size)
    Bases: b2sdk.sync.action.AbstractAction

    File uploading action.

class b2sdk.sync.action.B2HideAction(relative_name, b2_file_name)

class b2sdk.sync.action.B2DownloadAction(relative_name, b2_file_name, file_id, local_full_path, mod_time_millis, file_size)
    Bases: b2sdk.sync.action.AbstractAction

class b2sdk.sync.action.B2DeleteAction(relative_name, b2_file_name, file_id, note)

class b2sdk.sync.action.LocalDeleteAction(relative_name, full_path)
b2sdk.sync.exception¶

exception b2sdk.sync.exception.EnvironmentEncodingError(filename, encoding)
    Bases: b2sdk.exception.B2Error

    Raised when a file name cannot be decoded with the system encoding.

exception b2sdk.sync.exception.InvalidArgument(parameter_name, message)
    Bases: b2sdk.exception.B2Error

    Raised when one or more arguments are invalid.

exception b2sdk.sync.exception.IncompleteSync(*args, **kwargs)
    Bases: b2sdk.exception.B2SimpleError
b2sdk.sync.file¶

class b2sdk.sync.file.File(name, versions)
    Bases: object

    Hold information about one file in a folder.

    The name is relative to the folder in all cases.

    Files that have multiple versions (which only happens in B2, not in local folders) include information about all of the versions, most recent first.

    name

    versions
b2sdk.sync.folder¶

class b2sdk.sync.folder.AbstractFolder
    Bases: object

    Interface to a folder full of files, which might be a B2 bucket, a virtual folder in a B2 bucket, or a directory on a local file system.

    Files in B2 may have multiple versions, while files in local folders have just one.

    abstract all_files(reporter, policies_manager=<ScanPoliciesManager object>)
        Return an iterator over all of the files in the folder, in the order that B2 uses. Filtering is performed using the policies manager.

        No matter what the folder separator on the local file system is, '/' is used in the returned file names.

        If a file is found but does not exist (for example due to a broken symlink or a race), the reporter will be informed about each such problem.

        Parameters
            reporter – a place to report errors
            policies_manager – a policies manager object

b2sdk.sync.folder.join_b2_path(b2_dir, b2_name)
    Like os.path.join, but for B2 file names, where the root directory is called ''.

class b2sdk.sync.folder.LocalFolder(root)
    Bases: b2sdk.sync.folder.AbstractFolder

    Folder interface to a directory on the local machine.

    __init__(root)
        Initialize a new folder.
        Parameters
            root (str) – path to the root of the local folder. Must be unicode.

    all_files(reporter, policies_manager=<ScanPoliciesManager object>)
        Yield all files.
        Parameters
            reporter – a place to report errors
            policies_manager – a policies manager object, default is DEFAULT_SCAN_MANAGER

class b2sdk.sync.folder.B2Folder(bucket_name, folder_name, api)
    Bases: b2sdk.sync.folder.AbstractFolder

    Folder interface to B2.
b2sdk.sync.folder_parser¶

b2sdk.sync.folder_parser.parse_sync_folder(folder_name, api)
    Take either a local path or a B2 path, and return a Folder object for it.

    B2 paths look like: b2://bucketName/path/name. The '//' is optional, because the previous sync command didn't use it.

    Anything else is treated like a local folder.

    Parameters
        folder_name (str) – the name of the folder, either local or remote
        api (b2sdk.api.B2Api) – an API object
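For example (a minimal sketch; b2_api is assumed to be an authorized b2sdk.v1.B2Api instance, and parse_sync_folder is assumed to be re-exported by b2sdk.v1):

    from b2sdk.v1 import parse_sync_folder

    local_folder = parse_sync_folder('/home/user1/photos', b2_api)       # LocalFolder
    remote_folder = parse_sync_folder('b2://example-bucket/photos', b2_api)  # B2Folder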
b2sdk.sync.policy¶

class b2sdk.sync.policy.NewerFileSyncMode(value)
    Bases: enum.Enum

    Mode of handling files newer on the destination than on the source.

    SKIP = 101
        skip syncing such a file

    REPLACE = 102
        replace the file on the destination with the (older) file from the source

    RAISE_ERROR = 103
        raise a non-transient error, failing the sync operation

class b2sdk.sync.policy.CompareVersionMode(value)
    Bases: enum.Enum

    Mode of comparing versions of files to determine what should be synced and what shouldn't.

    MODTIME = 201
        use file modification time on the source filesystem

    SIZE = 202
        compare using file size

    NONE = 203
        compare using file name only
class b2sdk.sync.policy.AbstractFileSyncPolicy(source_file, source_folder, dest_file, dest_folder, now_millis, keep_days, newer_file_mode, compare_threshold, compare_version_mode=CompareVersionMode.MODTIME)
    Bases: object

    Abstract policy class.

    DESTINATION_PREFIX = NotImplemented

    SOURCE_PREFIX = NotImplemented

    __init__(source_file, source_folder, dest_file, dest_folder, now_millis, keep_days, newer_file_mode, compare_threshold, compare_version_mode=CompareVersionMode.MODTIME)
        Parameters
            source_file (b2sdk.v1.File) – source file object
            source_folder (b2sdk.v1.AbstractFolder) – source folder object
            dest_file (b2sdk.v1.File) – destination file object
            dest_folder (b2sdk.v1.AbstractFolder) – destination folder object
            now_millis (int) – current time in milliseconds
            keep_days (int) – days to keep before deleting
            newer_file_mode (b2sdk.v1.NewerFileSyncMode) – setting which determines handling of destination files that are newer than on the source
            compare_threshold (int) – threshold to use when comparing by size or time
            compare_version_mode (b2sdk.v1.CompareVersionMode) – how to compare source and destination files

    classmethod files_are_different(source_file, dest_file, compare_threshold=None, compare_version_mode=CompareVersionMode.MODTIME, newer_file_mode=NewerFileSyncMode.RAISE_ERROR)
        Compare two files and determine whether the destination file should be replaced by the source file.
        Parameters
            source_file (b2sdk.v1.File) – source file object
            dest_file (b2sdk.v1.File) – destination file object
            compare_threshold (int) – compare threshold when comparing by time or size
            compare_version_mode (b2sdk.v1.CompareVersionMode) – source file version comparator method
            newer_file_mode (b2sdk.v1.NewerFileSyncMode) – newer destination handling method
class b2sdk.sync.policy.DownPolicy(source_file, source_folder, dest_file, dest_folder, now_millis, keep_days, newer_file_mode, compare_threshold, compare_version_mode=CompareVersionMode.MODTIME)
    Bases: b2sdk.sync.policy.AbstractFileSyncPolicy

    File is synced down (from the cloud to disk).

    DESTINATION_PREFIX = 'local://'

    SOURCE_PREFIX = 'b2://'

class b2sdk.sync.policy.UpPolicy(source_file, source_folder, dest_file, dest_folder, now_millis, keep_days, newer_file_mode, compare_threshold, compare_version_mode=CompareVersionMode.MODTIME)
    Bases: b2sdk.sync.policy.AbstractFileSyncPolicy

    File is synced up (from disk to the cloud).

    DESTINATION_PREFIX = 'b2://'

    SOURCE_PREFIX = 'local://'

class b2sdk.sync.policy.UpAndDeletePolicy(source_file, source_folder, dest_file, dest_folder, now_millis, keep_days, newer_file_mode, compare_threshold, compare_version_mode=CompareVersionMode.MODTIME)
    Bases: b2sdk.sync.policy.UpPolicy

    File is synced up (from disk to the cloud) and the delete flag is SET.

class b2sdk.sync.policy.UpAndKeepDaysPolicy(source_file, source_folder, dest_file, dest_folder, now_millis, keep_days, newer_file_mode, compare_threshold, compare_version_mode=CompareVersionMode.MODTIME)
    Bases: b2sdk.sync.policy.UpPolicy

    File is synced up (from disk to the cloud) and the keepDays flag is SET.

class b2sdk.sync.policy.DownAndDeletePolicy(source_file, source_folder, dest_file, dest_folder, now_millis, keep_days, newer_file_mode, compare_threshold, compare_version_mode=CompareVersionMode.MODTIME)
    Bases: b2sdk.sync.policy.DownPolicy

    File is synced down (from the cloud to disk) and the delete flag is SET.

class b2sdk.sync.policy.DownAndKeepDaysPolicy(source_file, source_folder, dest_file, dest_folder, now_millis, keep_days, newer_file_mode, compare_threshold, compare_version_mode=CompareVersionMode.MODTIME)
    Bases: b2sdk.sync.policy.DownPolicy

    File is synced down (from the cloud to disk) and the keepDays flag is SET.
b2sdk.sync.policy.make_b2_delete_note(version, index, transferred)
    Create a note message for a delete action.
    Parameters
        version (b2sdk.v1.FileVersionInfo) – an object which contains file version info
        index (int) – file version index
        transferred (bool) – if True, the file has been transferred, False otherwise

b2sdk.sync.policy.make_b2_delete_actions(source_file, dest_file, dest_folder, transferred)
    Create the actions to delete files stored on B2 which are not present locally.
    Parameters
        source_file (b2sdk.v1.File) – source file object
        dest_file (b2sdk.v1.File) – destination file object
        dest_folder (b2sdk.v1.AbstractFolder) – destination folder
        transferred (bool) – if True, the file has been transferred, False otherwise

b2sdk.sync.policy.make_b2_keep_days_actions(source_file, dest_file, dest_folder, transferred, keep_days, now_millis)
    Create the actions to hide or delete existing versions of a file stored in B2.

    When keepDays is set, all files that were visible at any time from keepDays ago until now must be kept. For example, if versions were uploaded 5 days ago, 15 days ago, and 25 days ago, and keepDays is 10, only the 25-day-old version can be deleted: the 15-day-old version was still visible 10 days ago.

    Parameters
        source_file (b2sdk.v1.File) – source file object
        dest_file (b2sdk.v1.File) – destination file object
        dest_folder (b2sdk.v1.AbstractFolder) – destination folder object
        transferred (bool) – if True, the file has been transferred, False otherwise
        keep_days (int) – how many days to keep a file
        now_millis (int) – current time in milliseconds
b2sdk.sync.policy_manager¶

class b2sdk.sync.policy_manager.SyncPolicyManager
    Bases: object

    Policy manager; implements the logic to select the correct policy class and to create a policy object based on various parameters.

    get_policy(sync_type, source_file, source_folder, dest_file, dest_folder, now_millis, delete, keep_days, newer_file_mode, compare_threshold, compare_version_mode)
        Return a policy object.
        Parameters
            sync_type (str) – synchronization type
            source_file (str) – source file name
            source_folder (str) – a source folder path
            dest_file (str) – destination file name
            dest_folder (str) – a destination folder path
            now_millis (int) – current time in milliseconds
            delete (bool) – delete policy
            keep_days (int) – keep-for-days policy
            newer_file_mode (b2sdk.v1.NewerFileSyncMode) – setting which determines handling of destination files that are newer than on the source
            compare_threshold (int) – allowed difference between file modification time or file size
            compare_version_mode (b2sdk.v1.CompareVersionMode) – setting which determines how to compare source and destination files
        Returns
            a policy object
b2sdk.sync.scan_policies¶

class b2sdk.sync.scan_policies.RegexSet(regex_iterable)
    Bases: object

    Hold a (possibly empty) set of regular expressions and know how to check whether a string matches any of them.

b2sdk.sync.scan_policies.convert_dir_regex_to_dir_prefix_regex(dir_regex)
    The patterns used to match directory names (and file names) are allowed to match a prefix of the name. This 'feature' was unintentional, but is being retained for compatibility.

    This means that a regex that matches a directory name can't be used directly to match against a file name and test whether the file should be excluded because it matches the directory.

    The pattern 'photos' will match the directory names 'photos' and 'photos2', and should exclude the files 'photos/kitten.jpg' and 'photos2/puppy.jpg'. It should not exclude 'photos.txt', because there is no directory name that matches.

    On the other hand, the pattern 'photos$' should match 'photos/kitten.jpg' but not 'photos2/puppy.jpg', nor 'photos.txt'.

    If the original regex is valid, there are only two cases to consider: either the regex ends in '$' or it does not.

    Parameters
        dir_regex (str) – a regular expression string or literal
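A minimal sketch of the behavior described above (assuming the converted pattern is matched with re.match against the start of a relative file path):

    import re

    from b2sdk.sync.scan_policies import convert_dir_regex_to_dir_prefix_regex

    prefix = convert_dir_regex_to_dir_prefix_regex('photos')
    assert re.match(prefix, 'photos/kitten.jpg')   # inside a matching directory
    assert re.match(prefix, 'photos2/puppy.jpg')   # prefix match is intentional
    assert not re.match(prefix, 'photos.txt')      # a file, not a directory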
class b2sdk.sync.scan_policies.IntegerRange(begin, end)
    Bases: object

    Hold a range of two integers. If a range value is None, it indicates that the value should be treated as -Inf (for begin) or +Inf (for end).

class b2sdk.sync.scan_policies.ScanPoliciesManager(exclude_dir_regexes=(), exclude_file_regexes=(), include_file_regexes=(), exclude_all_symlinks=False, exclude_modified_before=None, exclude_modified_after=None)
    Bases: object

    Policy object used when scanning folders for syncing; it decides which files to include in the list of files to be synced.

    Code that scans through files should at least use should_exclude_file() to decide whether each file should be included; it will check include/exclude patterns for file names, as well as patterns for excluding directories.

    Code that scans may optionally use should_exclude_directory() to test whether it can skip a directory completely and not bother listing the files and sub-directories in it.

    __init__(exclude_dir_regexes=(), exclude_file_regexes=(), include_file_regexes=(), exclude_all_symlinks=False, exclude_modified_before=None, exclude_modified_after=None)
        Parameters
            exclude_dir_regexes (tuple) – a tuple of regexes used to exclude directories
            exclude_file_regexes (tuple) – a tuple of regexes used to exclude files
            include_file_regexes (tuple) – a tuple of regexes used to include files
            exclude_all_symlinks (bool) – if True, exclude all symlinks
            exclude_modified_before (int, optional) – optionally exclude file versions modified before this time (in millis)
            exclude_modified_after (int, optional) – optionally exclude file versions modified after this time (in millis)

    should_exclude_file(file_path)
        Given the full path of a file, decide if it should be excluded from the scan.

    should_exclude_file_version(file_version)
        Given the modification time of a file version, decide if it should be excluded from the scan.
        Parameters
            file_version (b2sdk.v1.FileVersion) – the file version object
        Returns
            True if excluded.
        Return type
            bool

    should_exclude_directory(dir_path)
        Given the full path of a directory, decide if all of the files in it should be excluded from the scan.
        Parameters
            dir_path (str) – the path of the directory, relative to the root directory being scanned. The path will never end in '/'.
        Returns
            True if excluded.
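For example (a minimal sketch; the regexes below are made up for illustration):

    from b2sdk.v1 import ScanPoliciesManager

    policies = ScanPoliciesManager(
        exclude_dir_regexes=('cache',),       # skips cache/ and its contents
        exclude_file_regexes=(r'.*\.tmp$',),  # skips *.tmp files anywhere
    )
    policies.should_exclude_file('notes.tmp')    # True
    policies.should_exclude_directory('cache')   # True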
b2sdk.sync.sync¶

b2sdk.sync.sync.next_or_none(iterator)
    Return the next item from the iterator, or None if there are no more.

b2sdk.sync.sync.zip_folders(folder_a, folder_b, reporter, policies_manager=<ScanPoliciesManager object>)
    Iterate over all of the files in the union of two folders, matching file names.

    Each item is a pair (file_a, file_b) with the corresponding file in both folders. Either file (but not both) will be None if the file is in only one folder.

    Parameters
        folder_a (b2sdk.sync.folder.AbstractFolder) – first folder object
        folder_b (b2sdk.sync.folder.AbstractFolder) – second folder object
        reporter – reporter object
        policies_manager – policies manager object
    Returns
        yields two-element tuples

b2sdk.sync.sync.count_files(local_folder, reporter)
    Count all of the files in a local folder.
    Parameters
        local_folder (b2sdk.sync.folder.AbstractFolder) – a folder object
        reporter – reporter object

class b2sdk.sync.sync.KeepOrDeleteMode(value)
    Bases: enum.Enum

    Mode of dealing with old versions of files on the destination.

    DELETE = 301
        delete the old version as soon as the new one has been uploaded

    KEEP_BEFORE_DELETE = 302
        keep the old versions of the file for a configurable number of days before deleting them, always keeping the newest version

    NO_DELETE = 303
        keep old versions of the file; do not delete anything
class b2sdk.sync.sync.Synchronizer(max_workers, policies_manager=<ScanPoliciesManager object>, dry_run=False, allow_empty_source=False, newer_file_mode=NewerFileSyncMode.RAISE_ERROR, keep_days_or_delete=KeepOrDeleteMode.NO_DELETE, compare_version_mode=CompareVersionMode.MODTIME, compare_threshold=None, keep_days=None)
    Bases: object

    __init__(max_workers, policies_manager=<ScanPoliciesManager object>, dry_run=False, allow_empty_source=False, newer_file_mode=NewerFileSyncMode.RAISE_ERROR, keep_days_or_delete=KeepOrDeleteMode.NO_DELETE, compare_version_mode=CompareVersionMode.MODTIME, compare_threshold=None, keep_days=None)
        Initialize the synchronizer class and validate arguments.
        Parameters
            max_workers (int) – max number of workers
            policies_manager – policies manager object
            dry_run (bool) – test mode; does not actually transfer or delete when enabled
            allow_empty_source (bool) – if True, do not check whether the source folder is empty
            newer_file_mode (b2sdk.v1.NewerFileSyncMode) – setting which determines handling of destination files that are newer than on the source
            keep_days_or_delete (b2sdk.v1.KeepOrDeleteMode) – setting which determines whether old versions should be deleted, kept for keep_days, or not deleted at all
            compare_version_mode (b2sdk.v1.CompareVersionMode) – how to compare the source and destination files to find new ones
            compare_threshold (int) – should be greater than 0, default is 0
            keep_days (int) – if keep_days_or_delete is b2sdk.v1.KeepOrDeleteMode.KEEP_BEFORE_DELETE, then this should be greater than 0

    sync_folders(source_folder, dest_folder, now_millis, reporter)
        Sync two folders. Always ensures that every file in the source is also in the destination. Deletes any file versions in the destination older than history_days.
        Parameters
            source_folder (b2sdk.sync.folder.AbstractFolder) – source folder object
            dest_folder (b2sdk.sync.folder.AbstractFolder) – destination folder object
            now_millis (int) – current time in milliseconds
            reporter (b2sdk.sync.report.SyncReport, None) – progress reporter

    make_folder_sync_actions(source_folder, dest_folder, now_millis, reporter, policies_manager=<ScanPoliciesManager object>)
        Yield a sequence of actions that will sync the destination folder to the source folder.
        Parameters
            source_folder (b2sdk.v1.AbstractFolder) – source folder object
            dest_folder (b2sdk.v1.AbstractFolder) – destination folder object
            now_millis (int) – current time in milliseconds
            reporter (b2sdk.v1.SyncReport) – reporter object
            policies_manager – policies manager object

    make_file_sync_actions(sync_type, source_file, dest_file, source_folder, dest_folder, now_millis)
        Yield the sequence of actions needed to sync the two files.
        Parameters
            sync_type (str) – synchronization type
            source_file (b2sdk.v1.File) – source file object
            dest_file (b2sdk.v1.File) – destination file object
            source_folder (b2sdk.v1.AbstractFolder) – a source folder object
            dest_folder (b2sdk.v1.AbstractFolder) – a destination folder object
            now_millis (int) – current time in milliseconds
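A minimal usage sketch, syncing a local directory up to a bucket (b2_api is assumed to be an authorized b2sdk.v1.B2Api instance; the paths and bucket name are made up for illustration):

    import time

    from b2sdk.v1 import (
        CompareVersionMode, KeepOrDeleteMode, NewerFileSyncMode,
        Synchronizer, parse_sync_folder,
    )

    synchronizer = Synchronizer(
        max_workers=10,
        newer_file_mode=NewerFileSyncMode.REPLACE,
        keep_days_or_delete=KeepOrDeleteMode.KEEP_BEFORE_DELETE,
        keep_days=10,
        compare_version_mode=CompareVersionMode.MODTIME,
    )
    synchronizer.sync_folders(
        source_folder=parse_sync_folder('/home/user1/photos', b2_api),
        dest_folder=parse_sync_folder('b2://example-bucket/photos', b2_api),
        now_millis=int(round(time.time() * 1000)),
        reporter=None,  # or a SyncReport object for progress reporting
    )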
b2sdk.transfer.inbound.downloader.abstract – Downloader base class¶

class b2sdk.transfer.inbound.downloader.abstract.AbstractDownloader(force_chunk_size=None, min_chunk_size=None, max_chunk_size=None)
    Bases: object
b2sdk.transfer.inbound.downloader.parallel – ParallelTransferer¶

class b2sdk.transfer.inbound.downloader.parallel.ParallelDownloader(max_streams, min_part_size, *args, **kwargs)
    Bases: b2sdk.transfer.inbound.downloader.abstract.AbstractDownloader

    FINISH_HASHING_BUFFER_SIZE = 1048576

    __init__(max_streams, min_part_size, *args, **kwargs)
        Parameters
            max_streams – maximum number of simultaneous streams
            min_part_size – minimum amount of data a single stream will retrieve, in bytes

    is_suitable(metadata, progress_listener)
        Analyze metadata (possibly against options passed earlier to the constructor) to find out whether the given download request should be handled by this downloader.

    download(file, response, metadata, session)
        Download a file from the given url using parallel download sessions and store it in the given download destination.
        Parameters
            file – an opened file-like object to write to
            response – the response of the first request made to the cloud service with download intent

class b2sdk.transfer.inbound.downloader.parallel.WriterThread(file, max_queue_depth)
    Bases: threading.Thread

    A thread responsible for keeping a queue of data chunks to write to a file-like object and for actually writing them down. Since a single thread is responsible for synchronization of the writes, we avoid a lot of issues between userspace and kernelspace that would normally require flushing buffers between the switches of the writer. That would kill performance, while not synchronizing would cause data corruption (we'd probably end up with a file containing unexpected blocks of zeros preceding the range of the writer that comes second and writes further into the file).

    An object of this class is also responsible for backpressure: if items are added to the queue faster than they can be written (see GCP VMs with standard PD storage, which have faster CPU and network than local storage, https://github.com/Backblaze/B2_Command_Line_Tool/issues/595), then obj.queue.put(item) will block, slowing down the producer.

    The recommended minimum value of max_queue_depth is equal to the number of producer threads, so that if all producers submit a part at the exact same time (right after a network issue, for example, or just after starting the read), they can continue their work without blocking. The writer should be able to store at least one data chunk before a new one is retrieved, but it is not guaranteed.

    Therefore, the recommended value of max_queue_depth is higher: double the number of producers, so that spikes on either end (many producers submitting at the same time / a consumer latency spike) can be accommodated without sacrificing performance.

    Please note that the chunk size and the queue depth affect the memory footprint. In a default setting as of this writing, that might be 10 downloads, 8 producers, 1 MB buffers, 2 buffers each: 8 * 2 * 10 = 160 MB (plus Python buffers, operating system overhead, etc.).

    __init__(file, max_queue_depth)
        This constructor should always be called with keyword arguments. Arguments are:
        group should be None; reserved for future extension when a ThreadGroup class is implemented.
        target is the callable object to be invoked by the run() method. Defaults to None, meaning nothing is called.
        name is the thread name. By default, a unique name is constructed of the form "Thread-N" where N is a small decimal number.
        args is the argument tuple for the target invocation. Defaults to ().
        kwargs is a dictionary of keyword arguments for the target invocation. Defaults to {}.
        If a subclass overrides the constructor, it must make sure to invoke the base class constructor (Thread.__init__()) before doing anything else to the thread.

    run()
        Method representing the thread's activity.
        You may override this method in a subclass. The standard run() method invokes the callable object passed to the object's constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.
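A general sketch of the bounded-queue backpressure pattern described above (not b2sdk's actual implementation; file name and sizes are made up): queue.put() blocks when the queue is full, which slows producers down to the writer's speed.

    import queue
    import threading

    def writer(f, q):
        # Single writer: consumes (offset, chunk) pairs and writes them down.
        while True:
            item = q.get()
            if item is None:  # sentinel: producers are done
                break
            offset, chunk = item
            f.seek(offset)
            f.write(chunk)

    q = queue.Queue(maxsize=16)  # double the number of producers (8), per the guidance above
    with open('out.bin', 'wb') as f:
        t = threading.Thread(target=writer, args=(f, q))
        t.start()
        for i in range(32):
            q.put((i * 4096, b'\x00' * 4096))  # blocks while the queue is full
        q.put(None)
        t.join()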
class b2sdk.transfer.inbound.downloader.parallel.AbstractDownloaderThread(session, writer, part_to_download, chunk_size)
    Bases: threading.Thread

    __init__(session, writer, part_to_download, chunk_size)
        Parameters
            session – raw_api wrapper
            writer – where to write data
            part_to_download – PartToDownload object
            chunk_size – internal buffer size to use for writing and hashing

    abstract run()
        Method representing the thread's activity.
        You may override this method in a subclass. The standard run() method invokes the callable object passed to the object's constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.

class b2sdk.transfer.inbound.downloader.parallel.FirstPartDownloaderThread(response, hasher, *args, **kwargs)
    Bases: b2sdk.transfer.inbound.downloader.parallel.AbstractDownloaderThread

    __init__(response, hasher, *args, **kwargs)
        Parameters
            response – response of the original GET call
            hasher – hasher object to feed as the stream is written

    run()
        Method representing the thread's activity. (See AbstractDownloaderThread.run().)

class b2sdk.transfer.inbound.downloader.parallel.NonHashingDownloaderThread(url, *args, **kwargs)
    Bases: b2sdk.transfer.inbound.downloader.parallel.AbstractDownloaderThread

    run()
        Method representing the thread's activity. (See AbstractDownloaderThread.run().)
b2sdk.transfer.inbound.downloader.range – transfer range toolkit¶

class b2sdk.transfer.inbound.downloader.range.Range(start, end)
    Bases: object

    HTTP ranges use an inclusive index at the end.

    classmethod from_header(raw_range_header)
        Factory method which returns an object constructed from the Range HTTP header.
        raw_range_header example: 'bytes=0-11'
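For example (a minimal sketch, assuming the start and end attributes mirror the constructor arguments):

    from b2sdk.transfer.inbound.downloader.range import Range

    r = Range.from_header('bytes=0-11')
    print(r.start, r.end)  # 0 11 -- the end index is inclusive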
b2sdk.transfer.inbound.downloader.simple – SimpleDownloader¶

class b2sdk.transfer.inbound.downloader.simple.SimpleDownloader(*args, **kwargs)
    Bases: b2sdk.transfer.inbound.downloader.abstract.AbstractDownloader
b2sdk.transfer.inbound.download_manager – Manager of downloaders¶

class b2sdk.transfer.inbound.download_manager.DownloadManager(services)
    Bases: object

    Handle complex actions around downloads to free raw_api from that responsibility.

    DEFAULT_MAX_STREAMS = 8

    DEFAULT_MIN_PART_SIZE = 104857600

    MIN_CHUNK_SIZE = 8192

    MAX_CHUNK_SIZE = 1048576

    __init__(services)
        Initialize the DownloadManager using the given services object.
        Parameters
            services (b2sdk.v1.Services) – the services object

    download_file_from_url(url, download_dest, progress_listener=None, range_=None)
        Parameters
            url – the url from which the file should be downloaded
            download_dest – where to put the file when it is downloaded
            progress_listener – where to notify about download progress
            range_ – a 2-element tuple containing the data of the HTTP Range header
b2sdk.transfer.inbound.file_metadata¶

class b2sdk.transfer.inbound.file_metadata.FileMetadata(file_id, file_name, content_type, content_length, content_sha1, file_info)
    Bases: object

    Hold information about a file which is being downloaded.

    file_id

    file_name

    content_type

    content_length

    content_sha1

    file_info
b2sdk.transfer.outbound.upload_source¶

class b2sdk.transfer.outbound.upload_source.AbstractUploadSource
    Bases: b2sdk.transfer.outbound.outbound_source.OutboundTransferSource

    The source of data for uploading to B2.

class b2sdk.transfer.outbound.upload_source.UploadSourceBytes(data_bytes, content_sha1=None)
    Bases: b2sdk.transfer.outbound.upload_source.AbstractUploadSource

class b2sdk.transfer.outbound.upload_source.UploadSourceLocalFile(local_path, content_sha1=None)
    Bases: b2sdk.transfer.outbound.upload_source.AbstractUploadSource

class b2sdk.transfer.outbound.upload_source.UploadSourceLocalFileRange(local_path, content_sha1=None, offset=0, length=None)
    Bases: b2sdk.transfer.outbound.upload_source.UploadSourceLocalFile

class b2sdk.transfer.outbound.upload_source.UploadSourceStream(stream_opener, stream_length=None, stream_sha1=None)
    Bases: b2sdk.transfer.outbound.upload_source.AbstractUploadSource

class b2sdk.transfer.outbound.upload_source.UploadSourceStreamRange(stream_opener, offset, stream_length, stream_sha1=None)
    Bases: b2sdk.transfer.outbound.upload_source.UploadSourceStream
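A minimal sketch constructing the two most common upload sources (the data and lengths are made up for illustration):

    import io

    from b2sdk.transfer.outbound.upload_source import (
        UploadSourceBytes, UploadSourceStream,
    )

    from_memory = UploadSourceBytes(b'hello world')
    from_stream = UploadSourceStream(
        lambda: io.BytesIO(b'hello world'),  # re-openable for upload retries
        stream_length=11,
    )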
b2sdk.raw_simulator – B2 raw api simulator¶

b2sdk.raw_simulator.get_bytes_range(data_bytes, bytes_range)
    Slice a bytes array using a bytes range.

class b2sdk.raw_simulator.KeySimulator(account_id, name, application_key_id, key, capabilities, expiration_timestamp_or_none, bucket_id_or_none, bucket_name_or_none, name_prefix_or_none)
    Bases: object

    Hold information about one application key, which can be either a master application key or one created with create_key().

class b2sdk.raw_simulator.PartSimulator(file_id, part_number, content_length, content_sha1, part_data)
    Bases: object

class b2sdk.raw_simulator.FileSimulator(account_id, bucket_id, file_id, action, name, content_type, content_sha1, file_info, data_bytes, upload_timestamp, range_=None)
    Bases: object

    One of three things: an unfinished large file, a finished file, or a deletion marker.

class b2sdk.raw_simulator.FakeRequest(url, headers)
    Bases: tuple

    property headers
        Alias for field number 1

    property url
        Alias for field number 0
class b2sdk.raw_simulator.FakeResponse(file_sim, url, range_=None)
    Bases: object

    property request

class b2sdk.raw_simulator.BucketSimulator(account_id, bucket_id, bucket_name, bucket_type, bucket_info=None, cors_rules=None, lifecycle_rules=None, options_set=None)
    Bases: object

    FIRST_FILE_NUMBER = 9999

    FIRST_FILE_ID = '9999'

    FILE_SIMULATOR_CLASS
        alias of b2sdk.raw_simulator.FileSimulator

    RESPONSE_CLASS
        alias of b2sdk.raw_simulator.FakeResponse

    copy_file(file_id, new_file_name, bytes_range=None, metadata_directive=None, content_type=None, file_info=None, destination_bucket_id=None)

    list_file_versions(start_file_name=None, start_file_id=None, max_file_count=None, prefix=None)

    update_bucket(bucket_type=None, bucket_info=None, cors_rules=None, lifecycle_rules=None, if_revision_is=None)
class b2sdk.raw_simulator.RawSimulator
    Bases: b2sdk.raw_api.AbstractRawApi

    Implement the same interface as B2RawApi by simulating all of the calls and keeping state in memory.

    The intended use for this class is for unit tests that test things built on top of B2RawApi.

    BUCKET_SIMULATOR_CLASS
        alias of b2sdk.raw_simulator.BucketSimulator

    API_URL = 'http://api.example.com'

    DOWNLOAD_URL = 'http://download.example.com'

    MIN_PART_SIZE = 200

    MAX_DURATION_IN_SECONDS = 86400000

    UPLOAD_PART_MATCHER = re.compile('https://upload.example.com/part/([^/]*)')

    UPLOAD_URL_MATCHER = re.compile('https://upload.example.com/([^/]*)/([^/]*)')

    DOWNLOAD_URL_MATCHER = re.compile('http://download.example.com(?:/b2api/v[0-9]+/b2_download_file_by_id\\?fileId=(?P<file_id>[^/]+)|/file/(?P<bucket_name>[^/]+)/(?P<file_name>.+))$')

    expire_auth_token(auth_token)
        Simulate the auth token expiring.
        The next call that tries to use this auth token will get an auth_token_expired error.

    set_upload_errors(errors)
        Store a sequence of exceptions to raise on upload. Each one will be raised in turn, until they are all gone. Then the next upload will succeed.

    create_bucket(api_url, account_auth_token, account_id, bucket_name, bucket_type, bucket_info=None, cors_rules=None, lifecycle_rules=None)

    create_key(api_url, account_auth_token, account_id, capabilities, key_name, valid_duration_seconds, bucket_id, name_prefix)

    copy_file(api_url, account_auth_token, source_file_id, new_file_name, bytes_range=None, metadata_directive=None, content_type=None, file_info=None, destination_bucket_id=None)

    copy_part(api_url, account_auth_token, source_file_id, large_file_id, part_number, bytes_range=None)

    list_file_names(api_url, account_auth_token, bucket_id, start_file_name=None, max_file_count=None, prefix=None)

    list_file_versions(api_url, account_auth_token, bucket_id, start_file_name=None, start_file_id=None, max_file_count=None, prefix=None)

    list_keys(api_url, account_auth_token, account_id, max_key_count=1000, start_application_key_id=None)

    list_unfinished_large_files(api_url, account_auth_token, bucket_id, start_file_id=None, max_file_count=None, prefix=None)

    start_large_file(api_url, account_auth_token, bucket_id, file_name, content_type, file_info)

    update_bucket(api_url, account_auth_token, account_id, bucket_id, bucket_type=None, bucket_info=None, cors_rules=None, lifecycle_rules=None, if_revision_is=None)
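A minimal unit-testing sketch (assuming a create_account() method on the simulator and a raw_api constructor parameter on b2sdk.v1.B2Api, as used by b2sdk's own tests):

    from b2sdk.raw_simulator import RawSimulator
    from b2sdk.v1 import B2Api, InMemoryAccountInfo

    simulator = RawSimulator()
    account_id, master_key = simulator.create_account()

    api = B2Api(InMemoryAccountInfo(), raw_api=simulator)
    api.authorize_account('production', account_id, master_key)
    bucket = api.create_bucket('test-bucket', 'allPrivate')  # exists only in memory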
Contributors Guide¶

We encourage outside contributors to make changes to our codebase. Many such changes have been merged already. In order to make it easier to contribute, core developers of this project:

provide guidance (through the issue reporting system)
provide tool-assisted code review (through the Pull Request system)
maintain a set of integration tests (run against a production cloud)
maintain a set of (well over a hundred) unit tests
automatically run unit tests on 13 versions of python (including osx and pypy)
format the code automatically using yapf
use static code analysis to find subtle/potential issues with maintainability
maintain other Continuous Integration tools (coverage tracker)

We marked the places in the code which are significantly less intuitive than others in a special way. To find these occurrences, use git grep '*magic*'.

To install a development environment, please follow this link.

To test in multiple python virtual environments, set the environment variable PYTHON_VIRTUAL_ENVS
to be a space-separated list of their root directories. When set, the makefile will run the
unit tests in each of the environments.

Before checking in, use the pre-commit.sh script to check code formatting, run
unit tests, run integration tests etc.

The integration tests need a file in your home directory called .b2_auth
that contains two lines with nothing on them but your account ID and application key:

accountId
applicationKey