Thread: How to Download and Format Data from the Web

  1. #1
    Junior Member
    Join Date
    Jun 2015
    Posts
    7
    Member #
    50149

    Question How to Download and Format Data from the Web

    Hey guys,

    I am currently designing a national business directory site similar to YP or Angie's List which will list local businesses of all kinds and allow users to search, browse, review, and rate businesses in their area. While most of the businesses will be added to my site by the users themselves (registering their own businesses on the site), I do not want to launch my directory site completely empty of listings. So I figured I would pick a few locations and list the businesses in the area, which leads to my issue. It is going to take months to create a page for each business even for just one small area. I can upload a CSV file containing the information of each business to save time, but finding all of the info and then formatting it correctly in a CSV document will still take forever.

    So my question is: is there a program/software that can recognize and auto-populate the necessary fields in my database from the data on a webpage so that I don't have to copy/paste and format thousands of rows of data? For example, if I want to create a list of all plumbers in Washington DC using the information available on YP.com, can I somehow set up a database program like Excel to automatically pull the business's name, address, phone number, hours of operation, etc. into the appropriate fields? If this sort of software is available anywhere I haven't been able to find it, and if it is not available, does anyone have any ideas as to how I could go about developing software like this myself?

    Any help or suggestions would be really appreciated!

  3. #2
    Unpaid WDF Intern TheGAME1264's Avatar
    Join Date
    Dec 2002
    Location
    Not from USA
    Posts
    14,483
    Member #
    425
    Liked
    2783 times
    First of all, I would advise against anything similar in concept to YP depending on how you're doing it. Certain business practices they have...at least here in Canada...are highly suspect and they often sell business owners services that no sane business owner would purchase if (s)he knew what (s)he was getting. That sounds vague, but I will be explaining that in the near future (read: within a week or two).

    But as far as your question is concerned, you can go about this a few ways, but your bigger problem may not be technical...it may be legal.

    1) You can buy business databases. Depending on the exact information you want (name, address, phone, email, website, # of employees, years in business, etc.) and the region(s) targeted the costs vary.

    2) You can get your information from an API from a source that offers business listings. YP does this itself, as do some others. You'd have to look at the individual APIs for more information as far as what they offer and what you can use.

    3) You can "scrape", or "data mine" the results from various sites. This is where you have to be very careful because you start entering into the grey "is this legal?" area. Simple rule of thumb: if you don't have explicit permission, don't do it.

    Side note: a lot of the databases in #1 are from scraped sites. Ask or research how the database information was gathered before buying the database.
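
    For option 2, the mapping step could be sketched roughly like this in Python. This is only an illustration, assuming the API returns a JSON array of listing objects. The field names (`name`, `address`, `phone`, `hours`) and the sample data are made up for the example; a real API will define its own, so check its documentation and terms of use.

```python
import csv
import io
import json

# Hypothetical field names; a real API will define its own.
FIELDS = ["name", "address", "phone", "hours"]


def listings_to_csv(json_text):
    """Convert a JSON array of listing objects into CSV text."""
    listings = json.loads(json_text)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for item in listings:
        # Keep only the columns we want; missing keys become empty cells.
        writer.writerow({field: item.get(field, "") for field in FIELDS})
    return out.getvalue()


# Made-up sample data standing in for an API response.
SAMPLE = (
    '[{"name": "Acme Plumbing", "address": "1 Main St, Washington, DC", '
    '"phone": "202-555-0100", "hours": "9-5", "rating": 4.5}, '
    '{"name": "Pipe Pros", "phone": "202-555-0101"}]'
)

if __name__ == "__main__":
    print(listings_to_csv(SAMPLE))
```

    The `csv` module handles quoting on its own, so an address containing commas still ends up in a single column.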
    If I've helped you out in any way, please pay it forward. My wife and I are walking for Autism Speaks. Please donate, and thanks.

    If someone helped you out, be sure to "Like" their post and/or help them in kind. The "Like" link is on the bottom right of each post, beside the "Share" link.

    My stuff (well, some of it): My bowling alley site | Canadian Postal Code Info (beta)

  4. #3
    Junior Member
    Join Date
    Jun 2015
    Posts
    7
    Member #
    50149
    Thanks for the reply, Game! My site is a 100% free service, so it won't be a ripoff like YP.

    I understand the methods you have described in order to get the business information, but how would I get the business info to populate the correct fields within my own database (probably Excel) without copying and pasting each value?

  5. #4
    Unpaid WDF Intern TheGAME1264's Avatar
    Join Date
    Dec 2002
    Location
    Not from USA
    Posts
    14,483
    Member #
    425
    Liked
    2783 times
    You'll usually want some sort of importer. If your data is in Excel, SQL Server Management Studio has one built in...you can import spreadsheets and give them table names to work with. You can also choose which fields you want to import (or not) and what datatypes they are. It's actually pretty easy.

    If you're using something else...that's where someone else would have to step in and provide insight.
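
    For anyone not on SQL Server, the same importer idea can be sketched in a few lines of Python. This is a minimal example, not the Management Studio workflow described above: it assumes SQLite as a stand-in database, and the `businesses` table, its columns, and the sample rows are all hypothetical.

```python
import csv
import io
import sqlite3


def import_csv(conn, csv_text):
    """Load CSV rows into a 'businesses' table, matching columns by header name."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS businesses ("
        "name TEXT NOT NULL, address TEXT, phone TEXT, hours TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [
        (r["name"], r.get("address", ""), r.get("phone", ""), r.get("hours", ""))
        for r in reader
    ]
    # Parameterized insert: values are never spliced into the SQL string.
    conn.executemany(
        "INSERT INTO businesses (name, address, phone, hours) VALUES (?, ?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


# Made-up sample rows in the hypothetical column layout.
SAMPLE_CSV = (
    "name,address,phone,hours\n"
    'Acme Plumbing,"1 Main St, Washington, DC",202-555-0100,9-5\n'
    "Pipe Pros,,202-555-0101,\n"
)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(import_csv(conn, SAMPLE_CSV), "rows imported")
    for row in conn.execute("SELECT name, phone FROM businesses"):
        print(row)
```

    Because the columns are matched by header name, the order of columns in the spreadsheet export does not matter, only the header spelling.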

  6. #5
    Junior Member lucky ali's Avatar
    Join Date
    Jun 2015
    Location
    uae
    Posts
    7
    Member #
    50074
    I have been trying to populate a database during the testing phase of one of my projects. I found nothing great to use. Then I found a Chrome extension that fills forms with dummy data. I used the form fill extension to fill and save forms so I'd have some data in the database.

  7. #6
    Junior Member
    Join Date
    Jun 2015
    Posts
    7
    Member #
    50149
    If anyone else wants to know...

    After trying a few different solutions I ended up getting a web scraper which works well for the task. I am currently using "Easy Web Extract" which does the job (most of the time) and was by far the cheapest option. Here's the link: Web Scraper, Web Extractor, Screen Scraper, Web Ripper
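
    For the curious, here is roughly what a scraper's field-extraction step looks like, sketched with Python's standard library. It is not how Easy Web Extract works internally. The markup (a `div` with class `listing` containing `span` elements named after each field) and the sample data are invented for the example, real pages will differ, and the permission caveat from earlier in the thread still applies.

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect one dict per <div class="listing">, keyed by span class name."""

    FIELDS = {"name", "address", "phone"}  # hypothetical class names

    def __init__(self):
        super().__init__()
        self.records = []
        self._current = None  # record being built, or None outside a listing
        self._field = None    # field whose text is currently being read

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class") or ""
        if tag == "div" and cls == "listing":
            self._current = {}
        elif tag == "span" and self._current is not None and cls in self.FIELDS:
            self._field = cls

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] = self._current.get(self._field, "") + data

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            # Field finished: trim surrounding whitespace.
            self._current[self._field] = self._current.get(self._field, "").strip()
            self._field = None
        elif tag == "div" and self._current is not None:
            # Listing finished (assumes no nested divs inside a listing).
            self.records.append(self._current)
            self._current = None


def extract_listings(html):
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return parser.records


# Made-up markup standing in for a results page.
SAMPLE_HTML = """
<div class="listing">
  <span class="name">Acme Plumbing</span>
  <span class="address">1 Main St, Washington, DC</span>
  <span class="phone">202-555-0100</span>
</div>
<div class="listing">
  <span class="name">Pipe Pros</span>
  <span class="phone">202-555-0101</span>
</div>
"""

if __name__ == "__main__":
    for record in extract_listings(SAMPLE_HTML):
        print(record)
```

    Each record comes back as a plain dict, so it can be written straight to a CSV file or inserted into a database table.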

