About Cloud Storage objects

This page describes objects, a resource in Cloud Storage. For a general overview of how Cloud Storage works, see the Cloud Storage product overview.

Objects

Objects are the individual pieces of data that you store in Cloud Storage. There is no limit on the number of objects that you can create in a bucket.

Objects have two components: object data and object metadata. Object data is typically a file that you want to store in Cloud Storage and is completely opaque to Cloud Storage. Object metadata is a collection of name-value pairs that describe various object qualities.

Two important pieces of object metadata common to all objects are the object's name and its generation number. When you add an object to a Cloud Storage bucket, you specify the object name and Cloud Storage assigns the generation number. Together, the name and generation uniquely identify the object within that bucket.

You can use access control lists (ACLs) to control access to individual objects. You can also use Identity and Access Management (IAM) to control access to all the objects within a bucket or a managed folder.

Naming considerations

When naming objects in Cloud Storage, it's essential to adhere to specific requirements to ensure compatibility and prevent errors. These requirements apply to both flat namespace buckets and hierarchical namespace enabled buckets.

General requirements

  • Object names can contain any sequence of valid Unicode characters.
  • Object names cannot contain Carriage Return or Line Feed characters.
  • Object names cannot start with .well-known/acme-challenge/.
  • Objects cannot be named . or .. (a single period or two consecutive periods).

Namespace-specific object name size limits

The maximum size of an object name varies depending on the namespace of the bucket:

  • Object name size in a flat namespace bucket: 1-1024 bytes when UTF-8 encoded.
  • Object name size in buckets with hierarchical namespace enabled: Object names are divided into two parts:

    • Folder name: The name of the folder in which the object resides. The maximum size for the folder name is 512 bytes when UTF-8 encoded.
    • Base name: The name of the object which resides in the folder. The maximum size for the base name is 512 bytes when UTF-8 encoded.

    For example, the path my-folder/my-object.txt represents an object with the base name my-object.txt, stored within a folder named my-folder/.
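The general requirements and size limits above can be checked before an upload. The following Python function is a minimal sketch, not an official validator: the function name and messages are hypothetical, and for hierarchical namespace buckets it assumes that the folder name is everything up to and including the last / and that the base name is whatever follows.

```python
def validate_object_name(name: str, hierarchical: bool = False) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    try:
        encoded = name.encode("utf-8")
    except UnicodeEncodeError:
        # For example, a lone surrogate cannot be UTF-8 encoded.
        return ["is not a valid Unicode sequence"]

    problems = []
    if "\r" in name or "\n" in name:
        problems.append("contains a carriage return or line feed")
    if name.startswith(".well-known/acme-challenge/"):
        problems.append("starts with .well-known/acme-challenge/")
    if name in (".", ".."):
        problems.append("is named . or ..")

    if hierarchical:
        # Assumption: split at the last "/" into folder name and base name.
        cut = name.rfind("/") + 1
        folder, base = name[:cut], name[cut:]
        if len(folder.encode("utf-8")) > 512:
            problems.append("folder name exceeds 512 bytes")
        if len(base.encode("utf-8")) > 512:
            problems.append("base name exceeds 512 bytes")
    elif not 1 <= len(encoded) <= 1024:
        problems.append("name is not 1-1024 bytes")
    return problems
```

The limits are measured in UTF-8 bytes, not characters, so a name made of multi-byte characters reaches the limit sooner than its character count suggests.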

Recommendations

It is strongly recommended that you avoid the following in your object names:

  • Control characters that are illegal in XML 1.0 (#x7F–#x84 and #x86–#x9F): these characters cause XML listing issues when you try to list your objects.
  • The # character: Google Cloud CLI commands interpret object names ending with #<numeric string> as version identifiers, so including # in object names can make it difficult or impossible to perform operations on such versioned objects using the gcloud CLI.
  • The [, ], *, or ? characters: Google Cloud CLI commands interpret these characters as wildcards, so including them in object names can make it difficult or impossible to perform wildcard operations. Additionally, * and ? are not valid characters for file names in Windows.
  • The :, ", <, >, or | characters: These are not valid characters for file names in Windows, so attempts to download an object that uses such characters in its name to a Windows file fail unless the download method includes renaming the resulting Windows file. The / character, while also not a valid character for file names in Windows, is typically OK to use in object names for mimicking a directory structure; tools such as the Google Cloud CLI automatically convert the character to \ when downloading to a Windows environment.
  • Sensitive or personally identifiable information (PII): object names are more broadly visible than object data. For example, object names appear in URLs for the object and when listing objects in a bucket.
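The character-based recommendations above lend themselves to a simple lint step. The sketch below is illustrative only: the function name and warning text are hypothetical, and it cannot detect sensitive information or PII, which needs a review of your own naming scheme.

```python
import re

# Control characters that are illegal in XML 1.0 (#x7F-#x84 and #x86-#x9F).
_XML_ILLEGAL = re.compile(r"[\x7f-\x84\x86-\x9f]")

# Discouraged characters, grouped by the reason given in the list above.
_DISCOURAGED = {
    "#": "the gcloud CLI can read a trailing #<number> as a version identifier",
    "[]*?": "the gcloud CLI interprets these characters as wildcards",
    ':"<>|': "not valid in Windows file names",
}


def lint_object_name(name: str) -> list[str]:
    """Return one warning per category of discouraged character found."""
    warnings = []
    if _XML_ILLEGAL.search(name):
        warnings.append("contains a control character that is illegal in XML 1.0")
    for chars, reason in _DISCOURAGED.items():
        found = sorted(set(name) & set(chars))
        if found:
            warnings.append(f"{' '.join(found)}: {reason}")
    return warnings
```

For example, lint_object_name("reports/2024/summary.csv") returns an empty list, because / is typically fine to use in object names.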

Existing objects cannot be directly renamed, but you can indirectly rename an object by copying it to the new name and then deleting the original object.

Object namespace

You can store objects in the following namespaces:

Flat namespace

Buckets with a flat namespace store objects in a flat structure without a hierarchy, meaning that there are no directories or folders.

For convenience, there are several ways that objects are treated as if they were stored in a folder hierarchy.

For example, if you create an object named folder1/file.txt in the bucket your-bucket, the path to the object is your-bucket/folder1/file.txt, and Cloud Storage has no folder named folder1 stored within it. From the perspective of Cloud Storage, the string folder1/ is part of the object's name.

However, because the object has a / in its name, some tools implement the appearance of folders. For example, when using the Google Cloud console, you would navigate to the object folder1/file.txt as if it were an object named file.txt in a folder named folder1. Similarly, you could create a managed folder named folder1, and then file.txt would be subject to the access policy set by this managed folder.

Note that because objects reside in a flat namespace, deeply nested, directory-like structures don't have the performance that a native filesystem has when listing deeply nested sub-directories.

See Request rate best practices for recommendations on how to optimize performance by avoiding sequential names during large-scale uploads. Objects uploaded with sequential names are likely to reach the same backend server and constrain performance.
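One way to avoid sequential names is to put a short hash in front of each name so that consecutive uploads no longer share a common leading string. The following sketch is an assumption rather than a prescribed method: the function name, the six-character prefix length, and the _ separator are all hypothetical choices.

```python
import hashlib


def with_hash_prefix(name: str, length: int = 6) -> str:
    """Prepend the first `length` hex digits of the name's SHA-256 digest."""
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()
    return f"{digest[:length]}_{name}"
```

The prefix is derived from the name itself, so the same input always maps to the same object name. The trade-off is that objects can no longer be listed by their original leading characters.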

Simulated folders

To help you organize objects in your Cloud Storage buckets, some tools simulate folders, and both the JSON and XML APIs have capabilities that let you design your own naming scheme to simulate folders. The following describes how the Google Cloud console, the command line, the JSON API, and the XML API handle simulated folders.

The Google Cloud console creates a visual representation of folders that resembles a local file browser.

In the Google Cloud console, you can create an empty folder in a bucket, or upload an existing folder.

When you upload an existing folder, the name of the folder becomes part of the path for all the objects contained in the folder. Any subfolders and the objects they contain are also included in the upload.

To create a folder:

  1. In the Google Cloud console, go to the Cloud Storage Buckets page.

  2. Navigate to the bucket.

  3. Click Create folder to create an empty new folder, or Upload folder to upload an existing folder.

Cloud Storage CLIs simulate the typical command-line directory experience using a variety of rules.

To achieve the illusion of a hierarchical file tree, the gcloud CLI applies the following rules to determine whether the destination URL in a command should be treated as an object name or a folder:

  1. If the destination URL ends with a / character, gcloud CLI commands treat the destination URL as a folder. For example, consider the following command, where your-file is the name of a file:

    gcloud storage cp your-file gs://your-bucket/abc/

    As a result of this command, Cloud Storage creates an object named abc/your-file in the bucket your-bucket.

  2. If you copy multiple source files to a destination URL, either by using the --recursive flag or a wildcard such as **, the gcloud CLI treats the destination URL as a folder. For example, consider the following command where top-dir is a folder containing files such as file1 and file2:

    gcloud storage cp top-dir gs://your-bucket/abc --recursive

    As a result of this command, Cloud Storage creates the objects abc/top-dir/file1 and abc/top-dir/file2 in the bucket your-bucket.

  3. If neither of these rules applies, the gcloud CLI checks the objects in the bucket to determine if the destination URL is an object name or a folder. For example, consider the following command where your-file is the name of a file:

    gcloud storage cp your-file gs://your-bucket/abc

    The gcloud CLI makes an object listing request for your-bucket, using the / delimiter and prefix=abc, to determine whether there are objects in your-bucket whose path starts with abc/. If so, the gcloud CLI treats abc/ as a folder name, and the command creates the object abc/your-file in the bucket your-bucket. Otherwise, the gcloud CLI creates the object abc in your-bucket.
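The three rules above can be modeled as a small decision function. This is a simplified sketch, not the gcloud CLI's implementation: the function name and parameters are hypothetical, the destination is given as the path inside the bucket (for example abc instead of gs://your-bucket/abc), the bucket listing is replaced by an in-memory set of names, and the extra path components added by --recursive are not modeled.

```python
import posixpath


def resolve_destination(dest: str, source_file: str,
                        multiple_sources: bool,
                        existing_objects: set[str]) -> str:
    """Return the object name that one copied file would receive."""
    base = posixpath.basename(source_file)
    # Rule 1: a trailing slash marks the destination as a folder.
    if dest.endswith("/"):
        return dest + base
    # Rule 2: copying multiple sources implies a folder destination.
    if multiple_sources:
        return f"{dest}/{base}"
    # Rule 3: the destination is a folder if objects already exist under it.
    if any(name.startswith(dest + "/") for name in existing_objects):
        return f"{dest}/{base}"
    # Otherwise the destination is the object name itself.
    return dest
```

Rule 3 is the only rule that depends on the current contents of the bucket, which is why the same command can produce different object names at different times.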

This rule-based approach differs from the way many tools work, which create 0-byte objects to mark the existence of folders. The gcloud CLI understands several conventions used by such tools, such as the convention of adding _$folder$ to the end of the name of the 0-byte object, but it doesn't require such marker objects to implement naming behavior consistent with UNIX commands.

In addition to these rules, how the gcloud CLI treats source files depends on whether or not you use the --recursive flag. If you use the flag, the gcloud CLI constructs object names to mirror the source directory structure, starting at the point of recursive processing. For example, consider the following command where home/top-dir is a folder containing files such as file1 and sub-dir/file2:

gcloud storage cp home/top-dir gs://your-bucket --recursive

As a result of this command, Cloud Storage creates the objects top-dir/file1 and top-dir/sub-dir/file2 in the bucket your-bucket.

In contrast, copying without the --recursive flag, even if multiple files are copied due to the presence of a wildcard such as **, results in objects named by the final path component of the source files. For example, assuming again that home/top-dir is a folder that contains files such as file1 and sub-dir/file2, then the command:

gcloud storage cp home/top-dir/** gs://your-bucket

creates an object named file1 and an object named file2 in the bucket your-bucket.
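The difference between the two naming behaviors can be reproduced with plain path operations. The following is a rough model under stated assumptions: both function names are hypothetical, file paths are given relative to the source folder, and the destination is the root of the bucket as in the two commands above.

```python
import posixpath


def recursive_names(source_dir: str, relative_files: list[str]) -> list[str]:
    """Mirror the directory structure, starting at the last component of source_dir."""
    top = posixpath.basename(source_dir.rstrip("/"))
    return [posixpath.join(top, rel) for rel in relative_files]


def wildcard_names(relative_files: list[str]) -> list[str]:
    """Name each object by the final path component of its source file."""
    return [posixpath.basename(rel) for rel in relative_files]


files = ["file1", "sub-dir/file2"]
print(recursive_names("home/top-dir", files))  # ['top-dir/file1', 'top-dir/sub-dir/file2']
print(wildcard_names(files))                   # ['file1', 'file2']
```

Because the wildcard form keeps only the final path component, two source files with the same file name in different subfolders map to the same object name.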

Retries and naming

When the gcloud CLI retries an interrupted request, you might encounter a problem where the first attempt copies a subset of files, and subsequent attempts encounter an already existing destination folder, which causes your objects to be named incorrectly.

For example, consider the following command, where there are subfolders under your-dir/ such as dir1 and dir2, and both subfolders contain the file abc:

gcloud storage cp ./your-dir gs://your-bucket/new --recursive

If the path gs://your-bucket/new doesn't exist yet, the gcloud CLI creates the following objects on the first successful attempt:

new/dir1/abc
new/dir2/abc

However, on the next successful attempt of the same command, the gcloud CLI creates the following objects:

new/your-dir/dir1/abc
new/your-dir/dir2/abc

To make the gcloud CLI work consistently on every attempt, try the following:

  1. Add a slash to the end of the destination URL so the gcloud CLI always treats it as a folder.

  2. Use gcloud storage rsync. Since rsync doesn't use the Unix cp-defined folder naming rules, it works consistently whether the destination subfolder exists or not.

Additional notes

  • You cannot create a zero-byte object to mimic an empty folder using the gcloud CLI.

  • When downloading to a local file system, the gcloud CLI skips objects whose names end with a / character, because creating a file whose name ends with a / is not allowed on Linux and macOS.

  • If you use scripts to build file paths by combining subpaths, note that because / is just a character that happens to be in the name of the object, the CLIs interpret gs://your-bucket/folder/ to be a different object from gs://your-bucket/folder.
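For scripts that combine subpaths, a small helper keeps the slashes predictable. This sketch is one possible convention, not a Cloud Storage requirement: the function name is hypothetical, and it assumes you want exactly one / between parts and a trailing / only when the last part has one. Object names can legitimately contain consecutive slashes, so don't use it where those must be preserved.

```python
def join_object_path(*parts: str) -> str:
    """Join subpaths with a single "/" between them.

    A trailing slash is kept only if the last part ends with one, because
    "folder" and "folder/" are different object names.
    """
    cleaned = [p.strip("/") for p in parts if p.strip("/")]
    joined = "/".join(cleaned)
    if joined and parts[-1].endswith("/"):
        joined += "/"
    return joined
```

With this helper, join_object_path("folder/", "/sub", "file.txt") gives folder/sub/file.txt, while join_object_path("folder") and join_object_path("folder/") stay distinct.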

Folders don't exist in the JSON API. You can narrow down the objects you list and simulate folders by using the prefix and delimiter query parameters.
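To see how prefix and delimiter together produce folder-like results, it can help to apply the same grouping to a plain list of names. The function below is a local model under stated assumptions, not an API client: the function name is hypothetical, and it returns the matching object names together with the folder-like prefixes that would be grouped at the next delimiter.

```python
def list_with_delimiter(names, prefix="", delimiter="/"):
    """Split names under `prefix` into direct objects and simulated subfolders."""
    objects, folders = [], set()
    for name in sorted(names):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            # No further delimiter: the object sits directly under the prefix.
            objects.append(name)
        else:
            # Collapse everything below the next delimiter into one prefix.
            folders.add(prefix + rest[:cut + len(delimiter)])
    return objects, sorted(folders)


names = ["folder/a.txt", "folder/subfolder/b.txt", "other.txt"]
print(list_with_delimiter(names, prefix="folder/"))
# (['folder/a.txt'], ['folder/subfolder/'])
```

Without a delimiter, a listing request returns every object whose name starts with the prefix, including objects in deeper simulated folders.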

For example, to list all the objects in the bucket my-bucket with the prefix folder/subfolder/, make an object listing request using this URL:

"https://storage.googleapis.com/storage/v1/b/my-bucket/o?prefix=folder/subfolder/"

Folders don't exist in the XML API. You can narrow down the objects you list and simulate folders by using the prefix and delimiter query parameters.

For example, to list all the objects in the bucket my-bucket with the prefix folder/subfolder/, make an object listing request using this URL:

"https://storage.googleapis.com/my-bucket?prefix=folder/subfolder/"

Removing simulated folders

Because simulated folders don't actually exist, you can typically remove simulated folders by renaming objects so that the simulated folder is no longer a part of the object's name. For example, if you have an object named folder1/file, you can remove the simulated folder folder1/ by renaming the object to just file.

However, if you have used a tool that creates zero-byte objects as folder placeholders, such as the Google Cloud console, you must delete the zero-byte object to remove the folder.

Hierarchical namespace

Hierarchical namespace lets you organize objects within a Cloud Storage bucket in a file system-like hierarchy of folders. Hierarchical namespace improves performance and helps you efficiently manage your data. To learn more about hierarchical namespace and when to use it, see Hierarchical Namespace.

Object immutability

Objects are immutable, which means that an uploaded object cannot change throughout its storage lifetime. An object's storage lifetime is the time between successful object creation, such as uploading, and successful object deletion. In practice, this means that you cannot make incremental changes to objects, such as append operations or truncate operations. However, it is possible to replace objects that are stored in Cloud Storage, and doing so happens atomically: until the new upload completes, the old version of the object is served to readers, and after the upload completes, the new version of the object is served to readers. A single replacement operation marks the end of one immutable object's lifetime and the beginning of a new immutable object's lifetime.

The generation number for an object changes each time you replace the object's data. Thus, the generation number uniquely identifies an immutable object.

Note that there is a once-per-second limit for rapidly replacing the same object. Replacing the same object more frequently might result in 429 Too Many Requests errors. You should design your application to upload data for a particular object no more than once per second and handle occasional 429 Too Many Requests errors using an exponential backoff retry strategy.
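An exponential backoff retry strategy for 429 errors might look like the following sketch. Everything named here is an assumption for illustration: TooManyRequests is a hypothetical stand-in for however your client library reports a 429 response, upload is any callable that performs the replacement, and the one-second base delay with random jitter is one reasonable choice among many.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 Too Many Requests response."""


def upload_with_backoff(upload, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call upload(), retrying on TooManyRequests with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return upload()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise  # Give up after the final attempt.
            # Wait 1s, 2s, 4s, ... plus up to 1s of random jitter.
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
```

The sleep parameter is injectable so that the retry logic can be exercised in tests without waiting in real time.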

What's next