  #1 - Wired (WDF Staff)
    OK, how to word this...

    With a large table, under normalization you separate any repeating information into its own table. For example, in a shipping database you would store the shipping address once and reference it with a unique identifier. So if a main table references that shipping address, the normalized approach is to grab the Shipping_ID from the main table and then join it against the Shipping_Address table to get the details.

    Here's the question. By the time we get to this main table, the shipping info has already been pulled from its table in an earlier step. Would it be more efficient to hold that shipping info in some PHP variables and print it to the screen with the rest of the main table info, rather than running an extra query and joining it like the example above?

    The proposed way would use 2 queries: one for the initial gathering of the shipping info (which is needed in a previous step anyway) and a second to grab the info from the main table.

    The normalized way would need 3: one for the initial gathering, one for the main table, and one to join the Shipping_ID contained in the main table with the information linked to that Shipping_ID in the Shipping table.

    So, whatcha think? Oh, by the way, we might compare the ID originally grabbed in the initial gathering against the Shipping_ID from the main table to double-check that it's correct, but with the way we're getting to the individual entries in the main table, that would only be a redundant check.
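    A minimal sketch of the two routes being compared, in PHP with PDO; the table and column names (orders, shipping_addresses, shipping_id, order_id) are invented for illustration, since the thread never shows the real schema:

    <?php
    // Hypothetical schema: orders(order_id, shipping_id, total) and
    // shipping_addresses(shipping_id, street, city). Connection details are placeholders.
    $pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

    // Route A: 2 queries, carry the shipping row along in PHP variables.
    $ship = $pdo->query("SELECT shipping_id, street, city
                           FROM shipping_addresses WHERE shipping_id = 42")
                ->fetch(PDO::FETCH_ASSOC);   // fetched in the earlier step anyway

    $order = $pdo->query("SELECT order_id, shipping_id, total
                            FROM orders WHERE order_id = 7")
                 ->fetch(PDO::FETCH_ASSOC);

    echo $order['total'] . ' ships to ' . $ship['street'] . ', ' . $ship['city'];

    // Route B: let the database do the join (the "3 query" route).
    $row = $pdo->query("SELECT o.order_id, o.total, s.street, s.city
                          FROM orders o
                          JOIN shipping_addresses s ON s.shipping_id = o.shipping_id
                         WHERE o.order_id = 7")
               ->fetch(PDO::FETCH_ASSOC);

    echo $row['total'] . ' ships to ' . $row['street'] . ', ' . $row['city'];

    Either way the work is small; the difference is only whether PHP or MySQL stitches the two rows together.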

  #2 - smoseley (WDF Staff)
    The most efficient way isn't always the best way. You have to decide what meets your client's needs best.

    You should begin thinking of your database and business logic separately.

    For instance, a "Company" is an object, and may also be represented by data, but not necessarily.

    When you think of ways to manipulate that Company object, don't think of how you're manipulating the data, but rather, assume that you already have the data.

    Then, you can abstract your database and create retrieval and storage procedures in the implementation of that abstraction (basically, you create your database connectivity separately from your business logic processing).

    In the example you showed, you might be working with an Address where Address.Type.Description = "Shipping". You should be able to get Address.Parent and thus retrieve your Company. Don't worry about how the data layer is doing it, though.

    As for the object-oriented vs. functional approach... you can abstract and layer in either; it's just a little different how you would do it.

    Personally, if I were you, I would find some way of modeling your data in your code (OO is preferred) and continue from there. Although it's not the fastest performer, it will open up new doors for you in the future.
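    For what it's worth, here is a minimal sketch of that layering idea in modern PHP; the class and method names (Address, AddressType, Company, AddressRepository, findShippingAddressFor) are all invented for illustration, not anything from an actual project:

    <?php
    // Sketch only: the business objects know nothing about SQL.

    class Company {
        public function __construct(public string $name) {}
    }

    class AddressType {
        public function __construct(public string $description) {}   // e.g. "Shipping"
    }

    class Address {
        public function __construct(
            public AddressType $type,
            public string $street,
            public string $city,
            public ?Company $parent = null    // the owning Company, if known
        ) {}
    }

    // The data layer lives behind an interface; swap the implementation
    // (MySQL, flat file, web service) without touching the business logic.
    interface AddressRepository {
        public function findShippingAddressFor(int $companyId): Address;
    }

    // Business logic works with objects and never sees the query behind them.
    function printShipTo(AddressRepository $repo, int $companyId): void {
        $addr = $repo->findShippingAddressFor($companyId);
        echo $addr->parent?->name . ' ships to ' . $addr->street . ', ' . $addr->city;
    }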

  #3 - Wired (WDF Staff)
    HUH? This has nothing to do with business; it's for ME. The shipping was just an example. In reality it has nothing to do with selling or shipping or anything like that; it was just the first UID that came to mind.

    I'm only interested in which would be more efficient, and whether there's any real benefit to doing it with 2 queries and some temporary storage in a PHP variable versus just doing 3 queries.

    The basic reasoning behind this question is that I'm working on a project with a friend. I was taught to use normalization but have no experience with it, while he does this for a living and doesn't use it. The server is very robust, so there's no real advantage for us either way, and we've agreed to do it his way: one huge table (20 fields) and a couple of lookup tables. It does make the PHP code a tad easier. The PHP variable idea popped into my head, and I was wondering if anyone had done it before, or if there's even any advantage to it.

  #4 - smoseley (WDF Staff)
    Ok, in terms of just the database... it's like this:

    The fastest possible table structure you can achieve is the "flat" structure: totally denormalized, all in one table. Every bit of data pertaining to a data element lives in that one table, with hundreds of columns, many of them repeated to fake one-to-many relationships. No foreign keys, and indexes on the search fields.

    Yeah, it's the fastest, which is why it's used for reporting in multi-million-record databases and for Active Directory, meta-directories, and LDAP structures. But those are about the only places that structure is useful.

    Flat tables lost popularity as a general-purpose database architecture over 10 years ago and have been replaced by normalized data structures, because nowadays usability and integrity matter more than raw speed.

    There are more than five "normal forms" of normalization, with Third Normal Form being the target for most data structures. You should get to know all of them, but here are the first three (a quick sketch of the Third Normal Form example follows this list):
    • First Normal Form - every column holds a single atomic value, and there are no repeating groups of columns (no phone1, phone2, phone3).
    • Second Normal Form - in addition, no non-key column depends on only part of the key. In practice, repeated one-to-many data, such as "phone_numbers" in relation to a "person", gets moved into its own table.
    • Third Normal Form - in addition, no column depends on another non-key column, so repeated meta-data gets pulled into a "domain" table for unique constraints and referential integrity. For instance, in a table called "users", a "user_type" of "admin" would be replaced with a "user_type_id" of 1, with a foreign key reference to a new "user_types" table containing the unique row (1, 'admin').
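    As a concrete illustration of that Third Normal Form example, here is roughly what the DDL could look like, run through PHP/PDO against MySQL; the other columns are omitted and the connection details are placeholders:

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

    $pdo->exec("
        CREATE TABLE user_types (
            id   INT UNSIGNED PRIMARY KEY,
            name VARCHAR(50) NOT NULL UNIQUE
        ) ENGINE=InnoDB
    ");

    $pdo->exec("
        CREATE TABLE users (
            id           INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
            username     VARCHAR(50) NOT NULL UNIQUE,
            user_type_id INT UNSIGNED NOT NULL,
            FOREIGN KEY (user_type_id) REFERENCES user_types (id)
        ) ENGINE=InnoDB
    ");

    // The string 'admin' now lives in exactly one row of the domain table;
    // every admin user just carries the id 1.
    $pdo->exec("INSERT INTO user_types (id, name) VALUES (1, 'admin')");
    $pdo->exec("INSERT INTO users (username, user_type_id) VALUES ('wired', 1)");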

  #5 - Wired (WDF Staff)
    OK... so the PHP idea's not worth it then, I take it? Gotcha.

  #6 - smoseley (WDF Staff)
    How you deal with your data in your program is a different question entirely. Your question is not unlike the age-old question of caching data. The truth of the matter is that most database servers already employ their own caching, so adding a second layer of caching in your application is usually just redundant.

    As for which route you should use in PHP... the performance difference probably won't be significant enough to matter in most cases. If and when you get to the point where you need to optimize your database and business logic, it's time to start thinking about a platform that supports subqueries, UDFs, and robust OO... of course, PHP and MySQL are on their way to becoming just that, so hang out and by the time you're ready, they'll be there with ya.

  #7 - Wired (WDF Staff)
    Oh! So if you call the address list once and grab an entry, then a couple of seconds later you need it again for a join, it's not going to query the server again; it's just going to hit the cache, which is less overhead (I think). Is that correct? Dunno if that stuff will be ready by Jan (our estimated release date).

  #8 - smoseley (WDF Staff)
    It all depends on the specific database software you're using, how you have caching set up on it, and whether it supports caching at all.

    Some application servers / web servers (e.g. Apache) also do their own caching internally... which complicates the matter even more!

    For most applications, you don't even have to worry about it though.
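    If you do want to see what the database itself is doing, a quick check on an older MySQL might look like the sketch below; this assumes the built-in query cache, which existed in the 4.x/5.x line and was removed in MySQL 8.0, and the connection details are placeholders:

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

    // Is the query cache turned on, and how big is it?
    foreach ($pdo->query("SHOW VARIABLES LIKE 'query_cache%'") as $row) {
        echo $row['Variable_name'] . ' = ' . $row['Value'] . PHP_EOL;
    }

    // A growing Qcache_hits counter means repeated identical queries are
    // being answered from the cache instead of being re-executed.
    foreach ($pdo->query("SHOW STATUS LIKE 'Qcache%'") as $row) {
        echo $row['Variable_name'] . ' = ' . $row['Value'] . PHP_EOL;
    }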

  #9 - Wired (WDF Staff)
    What would be better in your opinion: a database with normalization and maybe up to 6 joins to display an entry (yes, there's that much repeatable info), or everything bunched into one big table, maybe 20 fields max?

  #10 - smoseley (WDF Staff)
    Normalized. In a "members" table for a dating site, for example, you might see:

    id,
    username,
    password,
    first_name,
    last_name,
    ssn,
    gender_id REFERENCES genders(id),
    hair_color_id REFERENCES hair_colors(id),
    eye_color_id REFERENCES eye_colors(id),
    fitness_level_id REFERENCES fitness_levels(id),
    religion_id REFERENCES religions(id),
    education_level_id REFERENCES education_levels(id),
    occupation_id REFERENCES occupations(id),
    industry_id REFERENCES industries(id),
    salary_level_id REFERENCES salary_levels(id)

    Among many others. You should only do your JOINs on the "domain" data when you actually need the name. Internally, the id should be sufficient to work with (hair color 1, for instance, will always represent "blonde" for your purposes), and it will perform better than the string data you would otherwise be using in its place.
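    A small sketch of that last point, using the same hypothetical dating-site schema; the hair_colors table is assumed to have an id and a name column, and the connection details are placeholders:

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

    // Internal logic compares ids directly; no join needed.
    $stmt = $pdo->prepare("SELECT id, username, hair_color_id FROM members WHERE id = ?");
    $stmt->execute([7]);
    $member = $stmt->fetch(PDO::FETCH_ASSOC);

    if ((int) $member['hair_color_id'] === 1) {   // assume id 1 means "blonde"
        // ... branch on the id, never on the string ...
    }

    // Join against the domain table only when the page needs the words.
    $stmt = $pdo->prepare(
        "SELECT m.username, h.name AS hair_color
           FROM members m
           JOIN hair_colors h ON h.id = m.hair_color_id
          WHERE m.id = ?"
    );
    $stmt->execute([7]);
    print_r($stmt->fetch(PDO::FETCH_ASSOC));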

