Migrating Metadata When Encrypting an Amazon Redshift Cluster

By Allen Warren, 2015-11-22 16:19

    A customer came to us for help expanding and modifying their Amazon Redshift cluster. In the course of responding to their request, we made use of several tools available in the AWSLabs GitHub repository. What follows explains how you can use some of these tools the way we did.

    The customer was acquiring another, smaller company. Both companies had a BI infrastructure, and they believed that consolidating their platforms would reduce costs and simplify operations. They wanted to move the acquired organization's warehouse into the existing Amazon Redshift cluster, but there was a new requirement. Because of the nature of some of the acquired company's projects, they had a contractual obligation to encrypt data.

    Amazon Redshift supports encryption of data at rest, both in the database and in the associated snapshots. To enable encryption, it must be selected when the database is created. To encrypt a database after it has been created, you must stand up a new database and move the content from the unencrypted cluster to the new cluster, where the content is then encrypted.

    Moving the content of your application data tables is straightforward, because Amazon Redshift provides an UNLOAD feature for this purpose.

    To determine which tables to unload, consider running a query such as the following:

    SELECT tablename FROM pg_tables WHERE schemaname = 'public';

    Note that the schema name should be extended to reflect where you have created objects in your cluster. Running UNLOAD on the source cluster and COPY on the new cluster migrates the application data. Pretty simple!

    UNLOAD ('SELECT * FROM sample_table') TO 's3://mybucket/sample/sample_Table_' credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>' manifest;

    This command writes the result of the SELECT statement to a set of files, one or more per node slice, which simplifies reloading the data in parallel. It also creates a manifest file, which ensures that the COPY command loads all of the required files, and only the required files, into the encrypted cluster. Using a manifest file with the COPY command is a recommended practice.
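    The UNLOAD and COPY pair described above can be scripted for every table returned by the pg_tables query. The following is a minimal Python sketch, assuming the default UNLOAD file format is acceptable for your data; the helper names build_unload and build_copy are hypothetical and not part of any AWS tool, and the bucket prefix and credentials are placeholders.

```python
def build_unload(table, s3_prefix, credentials):
    """Return an UNLOAD statement that writes `table` to S3 with a manifest."""
    return (
        "UNLOAD ('SELECT * FROM {t}') TO '{p}{t}_' "
        "CREDENTIALS '{c}' MANIFEST;".format(t=table, p=s3_prefix, c=credentials)
    )


def build_copy(table, s3_prefix, credentials):
    """Return a COPY statement that loads `table` from the UNLOAD manifest.

    UNLOAD writes the manifest next to the data files, named
    <prefix>manifest, so the COPY source is derived from the same prefix.
    """
    return (
        "COPY {t} FROM '{p}{t}_manifest' "
        "CREDENTIALS '{c}' MANIFEST;".format(t=table, p=s3_prefix, c=credentials)
    )


# Placeholder values; substitute your own bucket and credentials.
creds = "aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>"
for name in ["sample_table", "old_sample"]:
    print(build_unload(name, "s3://mybucket/sample/", creds))
    print(build_copy(name, "s3://mybucket/sample/", creds))
```

    The target table must already exist in the encrypted cluster before COPY runs, so the table DDL has to be replayed there first.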

    You can also use the Amazon Redshift Unload/Copy utility to simplify this process further. The tool exports data from a source cluster to a location on S3 and encrypts the data with the AWS KMS service. It can also import the data into another Amazon Redshift cluster and clean up the S3 data.

    For many applications, the Amazon Redshift cluster contains more than just application data. Amazon Redshift supports creating database users and user groups, and assigning privileges to both. Recreating these accurately can be error-prone unless everything was created with scripts and every script is under source control. Fortunately, it is easy to generate scripts directly from the source cluster and run them on the encrypted cluster to replicate what you need.

    Before actually creating scripts, the best place to start is the AWSLabs GitHub repository. The AdminViews directory already contains a number of useful scripts. You can generate DDL for schemas, tables, and views. You can also list schema, view, and table privileges by user, and see which groups a user belongs to. All of this is useful information, but what you want to do is generate SQL statements in your source database and execute those statements in your new, encrypted database.
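    The pattern of generating statements on the source and executing them on the target can be sketched in a few lines of Python. This is only an illustration, assuming both connections expose Python DB-API 2.0 cursors; the function name replay_generated_sql is hypothetical, and two in-memory SQLite databases stand in for the two Amazon Redshift clusters so that the sketch runs anywhere.

```python
import sqlite3


def replay_generated_sql(source_cursor, target_cursor, generator_query):
    """Run a statement-generating query on the source and replay its output.

    Both cursors are assumed to follow the Python DB-API 2.0 interface
    (execute and fetchall). Each result row is expected to hold one
    complete SQL statement in its first column.
    """
    source_cursor.execute(generator_query)
    statements = [row[0] for row in source_cursor.fetchall()]
    for statement in statements:
        target_cursor.execute(statement)
    return statements


# Stand-in demo: SQLite databases play the roles of source and target.
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
source.execute("CREATE TABLE grp (groname TEXT)")
source.executemany("INSERT INTO grp VALUES (?)", [("chosen",), ("readonly",)])
replayed = replay_generated_sql(
    source.cursor(),
    target.cursor(),
    "SELECT 'CREATE TABLE ' || groname || ' (id INTEGER);' FROM grp ORDER BY groname",
)
print(replayed)
```

    Returning the list of replayed statements makes it easy to log exactly what was run against the new cluster.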

    You can get the list of users from the pg_user table as shown below:

    SELECT 'CREATE USER '|| usename || ';' FROM pg_user WHERE usename <> 'rdsdb';


    CREATE USER acctowner;

    CREATE USER mrexample;

    CREATE USER counterexample;

    CREATE USER mrspremise;

    CREATE USER mrsconclusion;

    You should assign passwords to the accounts you create. There is no way to extract the existing passwords from the source database, so you must create new ones.
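    Because new passwords are needed, the generated CREATE USER statements can be completed with a PASSWORD clause. The following is a minimal Python sketch, assuming the password rules are 8 to 64 characters with at least one uppercase letter, one lowercase letter, and one digit (check the current Amazon Redshift documentation for your cluster); the function names are hypothetical.

```python
import secrets
import string


def temporary_password(length=16):
    """Return a random password with at least one upper, lower, and digit."""
    alphabet = string.ascii_letters + string.digits
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        if (any(c.isupper() for c in candidate)
                and any(c.islower() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate


def create_user_statements(usernames):
    """Build one CREATE USER statement, with a fresh password, per user name."""
    return [
        "CREATE USER {} PASSWORD '{}';".format(name, temporary_password())
        for name in usernames
    ]


for statement in create_user_statements(["acctowner", "mrexample"]):
    print(statement)
```

    The generated passwords are meant to be temporary; each user should change theirs after the first login.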

    Download the code, expand the src directory, and find the scripts in the AdminViews directory. Create a schema called admin in your Amazon Redshift cluster and run each script whose name begins with v_ to create the views. You can then access the created views with a SQL statement like the one below:

    SELECT * FROM admin.v_generate_schema_ddl;

    schemaname: admin

    ddl: CREATE SCHEMA admin

    Run the v_generate_group_DDL.SQL script to generate the statements that create the user groups in the new database:

    SELECT 'CREATE GROUP '|| groname ||';' FROM pg_group;


    CREATE GROUP chosen;

    CREATE GROUP readonly;

    Users belong to groups. You can use the v_get_users_in_group script to capture these associations:

    SELECT 'ALTER GROUP ' ||groname||' ADD USER '||usename||';' FROM



    ALTER GROUP chosen ADD USER mrsconclusion;

    ALTER GROUP readonly ADD USER mrexample;

    ALTER GROUP readonly ADD USER mrspremise;
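    Since ALTER GROUP accepts a comma-separated list of users after ADD USER, the group memberships can also be collapsed into one statement per group. The following is a small Python sketch of that idea, assuming the (group, user) pairs have already been fetched from the source cluster; the function name alter_group_statements is hypothetical.

```python
from collections import OrderedDict


def alter_group_statements(memberships):
    """Combine (group, user) pairs into one ALTER GROUP statement per group."""
    users_by_group = OrderedDict()
    for group, user in memberships:
        users_by_group.setdefault(group, []).append(user)
    return [
        "ALTER GROUP {} ADD USER {};".format(group, ", ".join(users))
        for group, users in users_by_group.items()
    ]


pairs = [
    ("chosen", "mrsconclusion"),
    ("readonly", "mrexample"),
    ("readonly", "mrspremise"),
]
for statement in alter_group_statements(pairs):
    print(statement)
```

    The users and groups must already exist in the new cluster, so these statements belong after the CREATE USER and CREATE GROUP steps.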

    The DDL for schemas, views, and tables can be generated directly by running the appropriate scripts, such as v_generate_schema_DDL.SQL.



    You need to set the appropriate privileges on each schema in the new database. You can get the relevant information by running the following query on the existing database:

    SELECT * FROM admin.v_get_schema_priv_by_user

    WHERE schemaname

    NOT LIKE 'pg%'

    AND schemaname <> 'information_schema'

    AND usename <> 'johnlou'

    AND usename <> 'rdsdb';

    Here you can see the different permissions granted to each user who has been given privileges on a schema. To create the SQL to run on the new database, you can use a UDF (user-defined function) to build the list of privileges for each row in the result set. The function can be created as follows:

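    CREATE FUNCTION
    f_schema_priv_granted(cre boolean, usg boolean) RETURNS varchar
    STABLE
    AS $$
        # Collect the granted privileges, then join them so that
        # no leading comma appears when only usage is granted.
        privs = []
        if cre:
            privs.append('create')
        if usg:
            privs.append('usage')
        return ', '.join(privs)
    $$ LANGUAGE plpythonu;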

    The f_schema_priv_granted function returns a comma-separated list of the granted privileges. Run the function in a query to generate SQL containing GRANT statements:

    SELECT 'GRANT '|| f_schema_priv_granted(cre, usg) ||' ON schema '|| schemaname || ' TO ' || usename || ';'

    FROM admin.v_get_schema_priv_by_user

    WHERE schemaname NOT LIKE 'pg%'

    AND schemaname <> 'information_schema'

    AND usename <> 'rdsdb';


    GRANT CREATE, USAGE ON schema public TO mrexample;

    GRANT CREATE, USAGE ON schema public TO counterexample;

    GRANT CREATE, USAGE ON schema public TO mrspremise;

    GRANT CREATE, USAGE ON schema public TO mrsconclusion;

    As an alternative, if you prefer a CASE statement to a UDF, or are not comfortable with Python, you can write something similar to the following:

    SELECT 'GRANT '|| CASE WHEN cre IS TRUE AND usg IS TRUE THEN 'create, usage'
    WHEN cre IS TRUE THEN 'create'
    ELSE 'usage' END || ' ON schema '|| schemaname || ' TO ' || usename || ';' FROM admin.v_get_schema_priv_by_user
    WHERE schemaname NOT LIKE 'pg%'
    AND schemaname <> 'information_schema'
    AND schemaname <> 'public'
    AND usg = 'true';

    Similarly, a UDF can be used to build the list of privileges used in the GRANT statements for each view and table. The range of privileges is wider: SELECT, INSERT, UPDATE, DELETE, and REFERENCES. The UDF looks like the following:

    CREATE FUNCTION
    f_table_priv_granted(sel boolean, ins boolean, upd boolean, delc boolean, ref boolean) RETURNS varchar
    STABLE
    AS $$
        # Collect the granted privileges, then join them so that
        # no leading comma appears when select is not granted.
        privs = []
        if sel:
            privs.append('select')
        if ins:
            privs.append('insert')
        if upd:
            privs.append('update')
        if delc:
            privs.append('delete')
        if ref:
            privs.append('references')
        return ', '.join(privs)
    $$ LANGUAGE plpythonu;

    Note that in this function, the fourth argument, delc, does not match the name of the corresponding column in the view. del is a reserved word in Python, so it cannot be used as an argument name. Also note that if you prefer not to use a UDF, you can write a SQL statement with a CASE structure that does the same thing.
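    The privilege-list logic can be exercised locally as plain Python before it is wrapped in a UDF. The following is a minimal sketch, not the UDF itself; the function name table_priv_granted is hypothetical. It joins only the granted privileges, so the result never starts with a comma, and it confirms that del is a Python keyword.

```python
import keyword


def table_priv_granted(sel, ins, upd, delc, ref):
    """Return the granted table privileges as a comma-separated string."""
    flags = [
        (sel, "select"),
        (ins, "insert"),
        (upd, "update"),
        (delc, "delete"),
        (ref, "references"),
    ]
    return ", ".join(name for granted, name in flags if granted)


# `del` is a Python keyword, which is why the argument is named `delc`.
print(keyword.iskeyword("del"))
print(table_priv_granted(False, True, False, True, False))
```

    A row with no privileges at all yields an empty string, so such rows should be filtered out before a GRANT statement is built from them.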

    You can generate the privileges for tables with the following query:

    SELECT 'GRANT '|| f_table_priv_granted(sel, ins, upd, del, ref) || ' ON '|| schemaname||'.'||tablename ||' TO '|| usename || ';' FROM admin.v_get_tbl_priv_by_user WHERE schemaname NOT LIKE 'pg%'

    AND schemaname <> 'information_schema'

    AND usename <> 'rdsdb';


    GRANT SELECT ON public.old_sample TO mrexample;

    GRANT SELECT ON public.old_sample TO mrspremise;

    GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES ON public.old_sample TO mrsconclusion;

    GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES ON public.sample TO mrexample;

    GRANT SELECT ON public.sample TO mrspremise;

    Similarly, run the following query to generate the privileges for views:

    SELECT 'GRANT '|| f_table_priv_granted(sel, ins, upd, del, ref) || ' ON '|| schemaname||'.'||tablename ||' TO '|| usename || ';' FROM admin.v_get_view_priv_by_user WHERE schemaname NOT LIKE 'pg%'

    AND schemaname <> 'information_schema'

    AND usename <> 'rdsdb';



    These scripts made it easier to migrate the warehouse metadata to the new encrypted cluster. While moving the tables from the acquired company's warehouse into a separate schema in Amazon Redshift, several other scripts also proved very useful.

    - The table_info.sql script shows the pct_unsorted and pct_stats_off columns, which indicate how urgently the vacuum and analyze processes need to be run.

    - The table_inspector.sql script helps verify that the distribution keys chosen for the migrated tables are likely to be effective. The results include pct_skew_across_slices, the percentage of data distribution skew, and pct_slices_populated. Problem tables are those with either a high pct_skew_across_slices value or a low pct_slices_populated value.
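    The rule for spotting problem tables can be expressed as a simple filter over the inspector results. The following Python sketch is illustrative only: the function name problem_tables, the dictionary key "table", the threshold defaults, and the sample rows are all assumptions made up for the example, not values taken from the scripts.

```python
def problem_tables(rows, max_skew=50.0, min_populated=80.0):
    """Return names of tables whose distribution looks ineffective.

    `rows` is an iterable of dicts with the keys "table",
    "pct_skew_across_slices" and "pct_slices_populated". The default
    thresholds are illustrative assumptions, not recommended values.
    """
    flagged = []
    for row in rows:
        too_skewed = row["pct_skew_across_slices"] > max_skew
        too_sparse = row["pct_slices_populated"] < min_populated
        if too_skewed or too_sparse:
            flagged.append(row["table"])
    return flagged


# Made-up sample rows, for illustration only.
sample_rows = [
    {"table": "sample", "pct_skew_across_slices": 3.0, "pct_slices_populated": 100.0},
    {"table": "old_sample", "pct_skew_across_slices": 72.5, "pct_slices_populated": 100.0},
    {"table": "tiny_lookup", "pct_skew_across_slices": 0.0, "pct_slices_populated": 25.0},
]
print(problem_tables(sample_rows))
```

    Suitable thresholds depend on table size and cluster layout, so treat the defaults as a starting point to tune.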
