Sunday, February 19, 2012

convert INT to BIGINT

Hi guys

We have a table A which has an Id column of type INT. There are currently about 260,705,246 rows in the table, and it is growing at 100,000,000 rows a month, so we will hit the limits of INT pretty soon at this rate. I was asked to study the different options for converting the INT to BIGINT for the table. Conditions: (1) downtime should be minimal; (2) the method should be 100% reliable, with no unknown factors going into the rollout. I did my research and came up with a few methods:

(1) ALTER TABLE (I know this will not work)

(2) Create a new column in the same table, repopulate it with the existing data using a job, modify all the stored procs/business objects to update/insert the new column instead of the old one, drop the old column, rename the new column back to what it was (or leave it as is), and recreate the indexes/FK constraints on the new column instead of the old one.

(3) Create a new table B with the BIGINT datatype, BCP the data from A into B, create all the indexes, drop the old table, and rename the new table to the old name (a rough sketch of what I mean is just below).
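To make (3) concrete, the rough shape I have in mind is below; the extra columns, the index name, and the identity assumption are placeholders, since I can't post our real schema here.

-- New table with Id already BIGINT; the real table has more columns.
CREATE TABLE B (
    Id   BIGINT IDENTITY(1,1) NOT NULL,  -- assuming Id is an identity column
    Col1 INT NULL,                       -- placeholder columns
    Col2 VARCHAR(50) NULL
)
GO

-- ...load the data here (bcp in, or a batched INSERT ... SELECT)...

-- recreate the indexes / FK constraints on B
CREATE UNIQUE CLUSTERED INDEX IX_B_Id ON B (Id)
GO

-- swap the tables
DROP TABLE A
GO
EXEC sp_rename 'B', 'A'
GO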

I am suggesting method 3 as it requires the fewest operations, which means fewer changes (and fewer places to screw up) and less complexity.

If anyone else who has done this before can provide some feedback/do's and don'ts etc., I would appreciate it.

Use SQL Management Studio to change the column type from int to bigint. Click Save. Tell everyone you are working hard on it, and go to lunch.

This will essentially do #1 if it's possible. If it's not, it will do #3 automatically for you. The only downside is that with that many rows (depending on how big each row is), this could take some significant time. You may want to try this on a test system first to make sure it will complete in the time you have allotted.

|||

I should have mentioned: we have SQL 2000 in production, and it's on a 64-bit processor.

We tried #1 on an evaluation machine and it failed.
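For the record, what we ran on the evaluation machine was essentially just the plain ALTER (a sketch, not the exact script):

-- int -> bigint is a size-of-data change, so SQL Server has to rewrite
-- every row under a table lock, and if Id sits in a PK, index, or FK,
-- those have to be dropped first.
ALTER TABLE A ALTER COLUMN Id BIGINT NOT NULL   -- NOT NULL assumed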

|||

Right-click and register the 64-bit version in Enterprise Manager of SQL Server 2000 32-bit and it becomes local; we used Oracle 9i the same way, so I just tried it with SQL Server and it worked like a charm. So you can use Enterprise Manager to change the data type, or run an ANSI SQL ALTER TABLE to change INT to BIGINT.

|||

We already tried the ALTER TABLE approach and it failed. Normally I wouldn't be posting this question if it were a simple ALTER statement issue. There are millions of records in the table. The server just froze when we tried it. And we cannot afford any downtime on the server for trials like these unless it's a solid solution. Any other ideas?

|||

Is this an identity column?

Please send me a private message with a script to recreate your table structure including indexes, and let me know if you have any foreign keys to or from this table and I'll get you an answer within a couple days. I got some disk space, processor cycles and time to spare ;-)

Anything else you can tell me about your production server would help. How many CPUs, how much RAM, and what disk topology? (Logs on RAID 1, data on RAID 5?) An approximate table size would help as well.

|||

Create a blank database, then create a new table with BIGINT and run an INSERT INTO statement against the table to move the data, then move the new table into your database. If it works, just truncate or drop the original table. The last option is to use DTS to move the table back into your database. Run some tests with a copy of your production database.

|||

This operation needs to be done on a PROD server which is a 64-bit version. I believe DTS does not work on a 64-bit processor. We have tried the traditional methods of (1) INSERT INTO and (2) create a new BIGINT column in the same table, update the new column from the old column, and drop the old column, but with 900 million rows it blows up. One thing is clear to me: any solution would involve a "batch" process. We cannot do a simple update or drop on the entire table; we need to do it a few hundred thousand rows at a time. As of now, the only solution I have (haven't tried it out yet) that seems to work is to bcp out the table into multiple data files, bcp back into a new table (again in batches) with BIGINT already defined, then drop the old table and rename the new table to the old one. I am currently working on a replication issue and should be done by the end of the week. Early next week I should start doing some prototypes of the bcp solution and see how it works.

|||

Not really. SQL Server 2005 comes with 64-bit eval versions; use it to do the DTS (now Integration Services) operation to move the table back to your production SQL Server 2000 64-bit. I was a beta one tester of the product in 2001; the only known issues are pointer arithmetic problems, not data migration. The XP 64-bit eval can also be downloaded from the Microsoft site. BTW, both 64-bit versions are AMD products because Intel's Itanium is still for server products.

|||
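To make the batch idea concrete, something along these lines is what I have in mind. It assumes Id is the clustered key, B is the new table with Id already BIGINT, Col1/Col2 stand in for the real columns, and the IDENTITY_INSERT lines only matter if Id is an identity column.

-- Copy rows from A into the new BIGINT table B in batches,
-- walking the clustered key so each pass reads a small range.
DECLARE @LastId BIGINT, @Rows INT
SET @LastId = 0
SET @Rows = 1

SET IDENTITY_INSERT B ON              -- only if Id is an identity column

WHILE @Rows > 0
BEGIN
    INSERT INTO B (Id, Col1, Col2)    -- placeholder column list
    SELECT TOP 100000 Id, Col1, Col2
    FROM A
    WHERE Id > @LastId
    ORDER BY Id

    SET @Rows = @@ROWCOUNT            -- 0 once everything has been copied

    IF @Rows > 0
        SELECT @LastId = MAX(Id) FROM B   -- cheap if B is clustered on Id
END

SET IDENTITY_INSERT B OFF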

Our PROD server is Windows 2000 Datacenter Edition: 64 GB RAM, 32 processors, and a 64-bit processor.

|||

SQL Server 2000 64-bit does not run on Windows 2000; it runs on Windows 2003 64-bit. Microsoft sent us a pre-beta 1 build of the product to do the testing in 2001. If it does run on Windows 2000, I am not aware of it. That said, the operating system is not important to what you want to do, which is to move a table to SQL Server 2005 64-bit on XP 64-bit and send the table back to SQL Server 2000 64-bit. Just focus on data transfer between the SQL Servers.

|||

Yes. Almost every option we tried failed because of the huge data transfer. 900 million rows is not easy to trial-and-error. Yes, you are right. I found out we have Win2003 Datacenter Edition.

|||

Option 1 may work, but it will take a very long time to complete.

Option 2 won't work at all, because that approach can't be used with identity columns.

Option 3 might work, but you would probably be better off just INSERT INTO ... SELECTing the data from one table to the other rather than using BCP.
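For what it's worth, the bare-bones version of that would be something like the following (B being the new BIGINT table, the column names being placeholders, and IDENTITY_INSERT only needed if Id is an identity column). On this many rows it is one enormous transaction, which is why a batched loop like the one sketched earlier is usually kinder to the log.

-- One-shot copy from the old table into the new BIGINT table.
SET IDENTITY_INSERT B ON

INSERT INTO B (Id, Col1, Col2)   -- placeholder column list
SELECT Id, Col1, Col2
FROM A

SET IDENTITY_INSERT B OFF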

I've got an option 4 I'm working on, but it's coming along pretty slowly. I got hammered last week because of the MS/Eolas spat and had to do some emergency changes for someone. Nearly a billion rows is a lot of data; it took me almost 4 hours just to load test data, and that's not including building indexes.

I'm curious how your SAN performs compared to my desktop at home. A good SAN should put it to shame, but there are lots of people out there who don't know how to configure one. Coupled with the fact that you said a similar attempt was made years ago and failed, unless you've upgraded your SAN, it's getting pretty old (and probably relatively slow). In any case, I'll have a solution for you in a couple of days (sorry, I only run tests during the day when I'm not actively using the machine, because the tests literally kill the machine... multithreaded disk I/O wasn't my highest priority when I built it). I'm pretty sure I can get it down to a couple of hours (meaning 2) of downtime, if your SAN runs at least as fast as my desktop. I'm sure the extra 31 processors and 62 GB of RAM will help some too, but not a whole lot.

|||

Thanks for your efforts, Motley. I am sure our SAN must be up to date (I don't know much about the storage side yet). Time is exactly the problem I have. We do have an alternate site with exactly the same configuration as our PROD (Win2k3 DC Edition, 32 processors, 64 GB RAM, SQL x64 edition, etc.), and if I have a solid plan I can ask for a few hours of access to the ALT site for my testing. But they will only let me use it if I have at least reasonable confidence the solution will work, and when they do give me access I need to use every minute of it.

I do have other ideas in mind, but like I said, I am not sure how they will work with data this large.

Thanks again.

|||

Well, my idea is that since the data load takes such an extreme amount of time to do, I'm building a process to create a duplicate table (tmpWhatever) in the background.

Unfortunately, because of the sheer size of your table, the only way I see of doing this efficiently takes a few steps: create triggers to log changes (insert, update, delete) to an audit log, then begin loading data in the background row by row (more accurately, small batch by small batch), and once all the rows are there, replay the audit log against the temporary table (in small batches) until it catches up. A rough sketch of the trigger/audit piece is below. This should allow you (if you have the disk space to spare) to bring down the system at any point, run a stored procedure to make sure you've caught the last of the changes, drop your FKs, rename old to tmp2, rename tmp to old, apply the FKs to the new table, and you're done. I need to find out whether sp_rename will let the indexes follow the rename or not. If it will, then you'll need about 5 minutes of downtime. If it won't, and you need to rebuild the indexes, then it'll take a while longer, but shouldn't be more than a couple of hours.
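Something along these lines is what I mean by the audit log; the table, trigger, and column names are made up, and it assumes Id is the key of your table.

-- Audit table: one row per change to A, keyed by the changed row's Id.
CREATE TABLE A_ChangeLog (
    LogId    BIGINT   IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Action   CHAR(1)  NOT NULL,               -- 'I'nsert, 'U'pdate or 'D'elete
    Id       INT      NOT NULL,               -- key of the row that changed
    LoggedAt DATETIME NOT NULL DEFAULT GETDATE()
)
GO

-- Trigger on the live table that records which rows changed, so the
-- background copy can replay just those rows into tmpWhatever later.
CREATE TRIGGER trg_A_ChangeLog ON A
FOR INSERT, UPDATE, DELETE
AS
BEGIN
    -- inserts and updates: the affected keys are in the inserted table
    INSERT INTO A_ChangeLog (Action, Id)
    SELECT CASE WHEN EXISTS (SELECT * FROM deleted) THEN 'U' ELSE 'I' END, i.Id
    FROM inserted i

    -- deletes: keys that are in deleted but not in inserted
    INSERT INTO A_ChangeLog (Action, Id)
    SELECT 'D', d.Id
    FROM deleted d
    WHERE NOT EXISTS (SELECT * FROM inserted i WHERE i.Id = d.Id)
END
GO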

The other question I have relates to the FK relationship. Is it valid to have a BIGINT primary key referenced as a foreign key from another table where the column is still an integer? If it is (and I don't really see why it wouldn't be, but it's possible it's not), then you can do each table separately; if it isn't, then you'll need to do both processes at the same time.
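A quick way to check is to try it on a scratch database first with throwaway tables like these; if SQL Server rejects the foreign key because the data types differ, we'll know both tables have to be converted together.

-- Throwaway tables just to test whether an INT column can reference a BIGINT key.
CREATE TABLE ParentTest (Id BIGINT NOT NULL PRIMARY KEY)
GO

CREATE TABLE ChildTest (
    ChildId  INT NOT NULL PRIMARY KEY,
    ParentId INT NOT NULL REFERENCES ParentTest (Id)   -- int column referencing a bigint key
)
GO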
