Friday, March 30, 2012
Is there a system generated row id in SQL Server 2000?
I wanted to perform a select and a delete of a row based on a system generat
ed row id (like what Oracle has). However after researching this issue I've
found that there is no equivalent of Oracle Rowid in SQL Server.
I know I can generate my own row numbers using identity columns and then pro
cess the results but I was curious to know whether or not SQL Server actuall
y had a system generated row id. My thinking was that each row has to have a
row id because otherwise h
ow will b-tree indexes work (especially on non unique columns)?
Documentation doesn't really give out much information on what the Row struc
ture looks like in SQL Server (or at least I haven't been able to find it) n
ot have I found information on exactly what does a row pointer looks like in
a B-Tree index leaf page.
Can anyone suggest a source of information on the above? Also, if there is a
system generated row id in SQL Server, is any way at all of accessing it?hi;
am not a SQL expert here, but i think there is a datatype call IDENTITY that
auto generate an sequencing row number. that cld be what u need.
amnyone can verify ?
"SJ" <anonymous@.discussions.microsoft.com> wrote in message
news:1196BF60-EF0E-4E47-853F-18E79C95279E@.microsoft.com...
> Hi,
> I wanted to perform a select and a delete of a row based on a system
generated row id (like what Oracle has). However after researching this
issue I've found that there is no equivalent of Oracle Rowid in SQL Server.
> I know I can generate my own row numbers using identity columns and then
process the results but I was curious to know whether or not SQL Server
actually had a system generated row id. My thinking was that each row has to
have a row id because otherwise how will b-tree indexes work (especially on
non unique columns)?
> Documentation doesn't really give out much information on what the Row
structure looks like in SQL Server (or at least I haven't been able to find
it) not have I found information on exactly what does a row pointer looks
like in a B-Tree index leaf page.
> Can anyone suggest a source of information on the above? Also, if there is
a system generated row id in SQL Server, is any way at all of accessing it?|||You will have to generate your own rowid... SQL does not expose one ( as
some other DBMS's do.)
Wayne Snyder, MCDBA, SQL Server MVP
Computer Education Services Corporation (CESC), Charlotte, NC
www.computeredservices.com
(Please respond only to the newsgroups.)
I support the Professional Association of SQL Server (PASS) and it's
community of SQL Server professionals.
www.sqlpass.org
"SJ" <anonymous@.discussions.microsoft.com> wrote in message
news:1196BF60-EF0E-4E47-853F-18E79C95279E@.microsoft.com...
> Hi,
> I wanted to perform a select and a delete of a row based on a system
generated row id (like what Oracle has). However after researching this
issue I've found that there is no equivalent of Oracle Rowid in SQL Server.
> I know I can generate my own row numbers using identity columns and then
process the results but I was curious to know whether or not SQL Server
actually had a system generated row id. My thinking was that each row has to
have a row id because otherwise how will b-tree indexes work (especially on
non unique columns)?
> Documentation doesn't really give out much information on what the Row
structure looks like in SQL Server (or at least I haven't been able to find
it) not have I found information on exactly what does a row pointer looks
like in a B-Tree index leaf page.
> Can anyone suggest a source of information on the above? Also, if there is
a system generated row id in SQL Server, is any way at all of accessing it?|||Yes I think the keywords here are "sql does not expose one". I have been sus
pecting that although there may be a system generated row id, it wouldn't be
accesible to external users.|||To the best of my knowledge there really isn't one.
I don't worry about internal page strcutures all that much, so I could be
wrong. I'm sure Kalen's book Inside SQL Server 2000 would have a lot of
great info on this topic if you're really interested.
But to the best of my knowledge, internal id's are based on both page number
and row offset (or a row number) on that specific page. I do not believe
there is an internal ID that is unique across a table space. The short
answer is that it doesn't matter since there definitely isn't one exposed.
But I don't think one even exists...
Brian Moran
Principal Mentor
Solid Quality Learning
SQL Server MVP
http://www.solidqualitylearning.com
"SJ" <anonymous@.discussions.microsoft.com> wrote in message
news:56E22894-9ACE-4654-B93B-7E375D21A367@.microsoft.com...
> Yes I think the keywords here are "sql does not expose one". I have been
suspecting that although there may be a system generated row id, it wouldn't
be accesible to external users.|||Brian is correct - there is no exposed unique RID for a row.
There are two types of RIDs we use internally, physical and logical.
Physical RIDs are F:P:S, where F is the file ID, P is the page number in the
file and S is the slot number on the page. These are only used for locating
heap rows from non-clustered indexes over heaps (i.e. each row in a
non-clustered index over a heap contains the physical RID of the
corresponding row in the heap itself). Physical RIDs are not exposed,
although you will see them sometimes in DBCC CHECK* error messages.
Logical RIDs are the keys of the index the row belongs too. Each row in a
non-clustered index has a logical RID. Each row in a non-clustered index
over a clustered index also contains the logical RID of the corresponding
row in the clustered index).
Regards.
Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine
This posting is provided "AS IS" with no warranties, and confers no rights.
"Brian Moran" <brian@.solidqualitylearning.com> wrote in message
news:eSCl$8zEEHA.2404@.TK2MSFTNGP11.phx.gbl...
> To the best of my knowledge there really isn't one.
> I don't worry about internal page strcutures all that much, so I could be
> wrong. I'm sure Kalen's book Inside SQL Server 2000 would have a lot of
> great info on this topic if you're really interested.
> But to the best of my knowledge, internal id's are based on both page
number
> and row offset (or a row number) on that specific page. I do not believe
> there is an internal ID that is unique across a table space. The short
> answer is that it doesn't matter since there definitely isn't one exposed.
> But I don't think one even exists...
> --
> Brian Moran
> Principal Mentor
> Solid Quality Learning
> SQL Server MVP
> http://www.solidqualitylearning.com
>
> "SJ" <anonymous@.discussions.microsoft.com> wrote in message
> news:56E22894-9ACE-4654-B93B-7E375D21A367@.microsoft.com...
> suspecting that although there may be a system generated row id, it
wouldn't
> be accesible to external users.
>|||Thank you all for you responses.|||The only thing I would add is the reason that SQL Server doesn't expose row
ids. Systems that expose row ids require more offline and manual
maintenance than systems that don't expose row ids. That's because systems
that expose row ids can't just move rows around without breaking
applications. Systems that don't expose row ids can dynamically reorganize
data at any time since no user will ever request the data by rowid.
First generation systems such as Rdb and Oracle exposed rowids, second
generation systems such as Tandem and Sybase saw the errors in this and did
not expose them. Of course once exposed it's almost impossible to take away
the feature, so Oracle still has it.
Hal Berenson, SQL Server MVP
VP, Yukon Readiness
Scalability Experts, Inc.
"Paul S Randal [MS]" <prandal@.online.microsoft.com> wrote in message
news:uyL9fz3EEHA.1456@.TK2MSFTNGP09.phx.gbl...
> Brian is correct - there is no exposed unique RID for a row.
> There are two types of RIDs we use internally, physical and logical.
> Physical RIDs are F:P:S, where F is the file ID, P is the page number in
the
> file and S is the slot number on the page. These are only used for
locating
> heap rows from non-clustered indexes over heaps (i.e. each row in a
> non-clustered index over a heap contains the physical RID of the
> corresponding row in the heap itself). Physical RIDs are not exposed,
> although you will see them sometimes in DBCC CHECK* error messages.
> Logical RIDs are the keys of the index the row belongs too. Each row in a
> non-clustered index has a logical RID. Each row in a non-clustered index
> over a clustered index also contains the logical RID of the corresponding
> row in the clustered index).
> Regards.
> --
> Paul Randal
> Dev Lead, Microsoft SQL Server Storage Engine
> This posting is provided "AS IS" with no warranties, and confers no
rights.
> "Brian Moran" <brian@.solidqualitylearning.com> wrote in message
> news:eSCl$8zEEHA.2404@.TK2MSFTNGP11.phx.gbl...
be
> number
exposed.
been
> wouldn't
>
Wednesday, March 28, 2012
Is there a more efficient way to perform this process?
My e-commerce site is currently running the following process when items are shipped from the manufacturer:
1. Manufacturer sends a 2-column CSV to the retailer containing an Order Number and its Shipping Tracking Number.
2. Through the web admin panel I've built, a retail staff member uploads the CSV to the server.
3. In code, the CSV is parsed. The tracking number is saved to the database attached to the Order Number.
4. After a tracking # is saved, each item in that order has its status updated to Shipped.
5. The customer is sent an email using ASPEmail from Persits which contains their tracking #.
The process seems to work without a problem so long as the CSV contains roughly 50 tracking #'s or so. The retailer has gotten insanely busy and wants to upload 3 or 4 thousand tracking #'s in a single CSV, but the process times out, even with large server timeout values being set in code. Is there a way to streamline the process to make this work more efficiently? I can provide the code if that helps.
It really depends on how your database backend works. SQL Server? Transactions?
|||
SQL Server 2000. I am not using Transactions. Any suggestions you have as to how I can improve the performance he would be appreciated. Here's what the code looks like. Basically I am building a giant SQL string (multiple EXEC calls on the same procedure) with a stringbuilder and executing it all at once at the bottom:
objOleDb.Open()
objCmd = New OleDbCommand("SELECT * FROM TrackingImport.csv", objOleDb)
objRdr = objCmd.ExecuteReader()
While objRdr.Read()
If Len(Trim(objRdr(1).ToString)) > 0 And Len(Trim(objRdr(0).ToString)) > 0 Then
If IsNumeric(objRdr(0)) Then
If CLng(objRdr(0)) > 0 Then
strSQL.Append("EXEC spImportTracking '" & objRdr(1) & "'," & objRdr(0) & ";")
EmailTracking(objRdr(0), objRdr(1))
i = i + 1
End If
End If
End If
End While
'Update orders with tracking numbers
dbobj.Open()
objSQLCmd = New SqlCommand(strSQL.ToString(), dbobj)
objSQLCmd.ExecuteNonQuery()
The stored procedure called above looks like this:
AlterPROCEDUREspImportTracking
@.TrackingNumberNVARCHAR(50)=NULL,
@.OrderNumberINT=NULL
AS
DECLARE@.OrderAddressIDINT,
@.OrderIDINT,
@.OrderItemIDINT,
@.CountryVARCHAR(2)
/*1. Get OrderID*/
SELECT@.OrderID=(SELECTuid
FROMOrders
WHEREOrderNumber=@.OrderNumber)
/*2. Check Country*/
SELECT@.Country=(SELECTTOP1Country
FROMOrderAddresses
WHEREOrderID=@.OrderID
ANDType=3)
IF@.Country='CA'ANDLEFT(@.TrackingNumber,2)='1Z'
BEGIN
SELECT dbo.Orders.uid,dbo.Orders.OrderNumber,dbo.Customers.FirstName,dbo.Customers.LastName,dbo.Customers.EMail,
'Invalid'AS'TrackingNo'
FROM dbo.CustomersINNERJOIN
dbo.OrdersONdbo.Customers.uid=dbo.Orders.CustomerID
WHEREdbo.Orders.OrderNumber=@.OrderNumber
END
ELSE
BEGIN
/*3. Insert to Tracking*/
INSERTINTOOrderTracking(BackOrderFlag,TrackingNumber,TrackingMessage)
VALUES(0,@.TrackingNumber,@.OrderNumber)
/*2. Get all unpaid (full or partial) orders for thiscustomer*/
DECLAREOrder_CursorCURSORFOR
SELECT OrderItems.uid
FROM OrderItems
WHERE OrderItems.OrderID=@.OrderID
OPENOrder_Cursor
FETCHNEXTFROMOrder_Cursor
INTO@.OrderItemID
WHILE@.@.FETCH_STATUS=0
BEGIN
/*2. Mark All Items in this Order as Shipped*/
EXECspUpdateProductionStatus@.OrderItemID,100
FETCHNEXTFROMOrder_Cursor
INTO@.OrderItemID
END
CLOSEOrder_Cursor
DEALLOCATEOrder_Cursor
/*Return Customer information for confirmation Email*/
SELECT dbo.Orders.uid,dbo.Orders.OrderNumber,dbo.Customers.FirstName,dbo.Customers.LastName,dbo.Customers.EMail,@.TrackingNumberAS'TrackingNo'
FROM dbo.CustomersINNERJOIN
dbo.OrdersONdbo.Customers.uid=dbo.Orders.CustomerID
WHEREdbo.Orders.OrderNumber=@.OrderNumber
END
SETNOCOUNTON
The EmailTracking method that gets called for each record looks like this:
PrivateSubEmailTracking(ByVal intOrderNumberAsLong,ByVal strTrackingNoAsString)
Dim objEmailAsNew CEmail()
Dim strBodyAsNewStringBuilder()
Dim dbobjAsNewSqlConnection(ConfigurationSettings.AppSettings("strConn"))
Dim objCmdAsSqlCommand
Dim objRdrAs SqlDataReader
Dim strBillCountryAsString =""
Try
dbobj.Open()
objCmd =NewSqlCommand("spGetShippingAddresses", dbobj)
objCmd.CommandType = CommandType.StoredProcedure
objCmd.Parameters.Add("@.OrderNumber",intOrderNumber)
objRdr = objCmd.ExecuteReader()
objEmail.strFrom = "shipping@.mydomain.com"
objEmail.strFromName = "Shipping Department"
objEmail.strSubject = "Ship Notification, TrackingNumber " & strTrackingNo
strBody.Append(strIntro)
While objRdr.Read
strBillCountry =objRdr("BillCountry").ToString
objEmail.strTo = objRdr("Email").ToString
strBody.Append("<b>Ship To:</b><br>"& objRdr("Recipient").ToString & "<br>" &_
objRdr("Address").ToString &"<br>" & objRdr("City").ToString &"<br>" & objRdr("State").ToString & _
"<br>" &objRdr("Zip").ToString & "<br>" &objRdr("Country").ToString & "<hr>")
IfobjRdr("WSClient").ToString <> ""Then
UpdatePhotoMgr(objRdr("WSClient").ToString,objRdr("PO#").ToString, strTrackingNo)
EndIf
EndWhile
strBody.Append("<p><b>Tracking Number:</b>" & strTrackingNo & "</p>" & _
"<p><b>Reference Order #:</b>" & intOrderNumber & "</p><p>"& _
SetTrackingLink(strTrackingNo))
strBody.Append(strEnd)
objEmail.strMessage = strBody.ToString
If strBillCountry ="US"Or strBillCountry = ""Or (strBillCountry = "CA"And Left(strTrackingNo, 1) = "D")Then
IfLen(objEmail.strTo) > 0Then
objEmail.SendMail()
Else
Response.Write("<script language='javascript'>alert('AddressMissing for Order " & _
intOrderNumber.ToString &"')</script>")
EndIf
EndIf
Catch objErrorAs Exception
lblError.Text = "Error occurred when sending to" & objEmail.strTo & ": " & objError.Message & _
". Please try re-loading the page, or contact technical support with adescription of this problem."
'Notify tech support
Dim objErrAsNew CEmail()
objErr.SendErrorMessage(objError, "Tracking EmailError", objEmail.strTo, Session("CompanyID"),Session("StoreID"))
objErr =Nothing
context.ClearError()
ExitTry
Finally
objRdr.Close()
objCmd.Dispose()
dbobj.Close()
EndTry
EndSub
1 thought I had would be rather than going to the database each time to retrieve the customer info for email purposes, I could grab all the customer info once in a dataset and then filter with a dataview 1 record at a time to send the emails. Not sure if that would help or not.
|||
1. Using the OleDbConnection to read the file is cool, but I think you could just as well just use a plain StreamReader, get each rows and split the string by the comma.
2. Don't use plain SQL strings. An evil CSV file could wipe out the entire database (consider the row "abc,1;DROP TABLE Orders")
3. Execute everything within a transaction. Not only could it help performance, it's also vital since this operation must be atomic. All the orders should be updated, and if something fails, nothing should happen.
4. Use several calls to spImportTracking (properly parametrized) and execute them each with its own ExecuteNonQuery.
5. Treat the email sending as a separate process. It could fail, and the orders need to know this. Add a status column to the Order table to indicate that an email should be sent, and process it afterwards. Don't return anything from spImportTracking.
6. What does spUpdateProductionStatus do? If it's just setting a status the entire cursor block should be replaced with someting like:
UPDATE OrderItems SET ProductionStatus=100 WHERE OrderID = @.OrderID
Extra:
Log each file upload (who, what and when) in a table with some kind of identifier. Store that identfier with each order (e.g add a column called "batchFileId" referring to the log table). It's good for tracking down errors or mistakes and can also use it when sending the emails (e.g send all emails relating to batchFileid=345)
Don't prefix your stored procedures with "sp". That prefix is reserved for system stored procedures.
An alternative would be to import the actual file contents into a table, and then make one single massive update by relating to that table (the import could be done using a DataAdapter, the SqlBulkUpdate class or just plain inserts (what the adapter would do anyway))|||
Thanks for such a thoughtful reply. I don't usually use SQL strings, but figured appending the Exec statements together and calling ExecuteNonQuery once was more efficient than doing in a loop. I will try that as you've suggested. I will test it out with the StreamReader, and yes, #6 makes a lot more sense. The UpdateProductionStatus procedure has a lot of logic it in to determine the order status which I guess I don't need in this case since I know what the status should be already (and the cursor isn't needed either).
The StreamReader thing was really a minor issue, so don't wast too much energy on fixing that. Remember to use transactions, i.e
BeginTransaction
A lot of ExecuteNonQueries
If success then
Commit
Else
Rollback
End If