Design and Implementation of Domain-specific Business Information Search System in Electronic Commerce Environment(1)

By Pauline Parker,2014-08-06 14:35

    Ruijun Xia, Qing Wang, Dingwei Wang, Lili Liu

    1. Institute of System Engineering, Information College, Northeastern University, Shenyang, 110004

    E-mail: qwang@mail.neu.edu.cn

    2. Modern Logistics Center, Shanghai University, Shanghai, 200072, China

    Abstract: In an Electronic Commerce (EC) environment, the quality of business information directly affects the level of enterprise operations. This paper analyses the common methods of business information retrieval in the EC environment and designs a software system that automatically gathers business information from the Internet and extracts the business information required by an enterprise directly from a database. The system adopts a meta-search engine to extend the search range, and applies information retrieval, web mining and agent technology to analyze and filter the business information, which improves the search quality of business information.

    Key Words: Electronic Commerce (EC); Business information; Information Retrieval (IR); Meta-Search Engine (MSE)

    1 INTRODUCTION

    In recent years, the application of Electronic Commerce (EC) has become more and more widespread. Enterprises need more and more business information, such as information on raw materials, products, suppliers and customers, and use this information to support enterprise decision-making. Whether an enterprise in the Electronic Commerce environment can obtain accurate, comprehensive and necessary business information in time therefore bears on the success or failure of its Electronic Commerce operation. The enterprise must go beyond the relatively narrow operating environment of the past, and collect and use business information effectively.

    In the Electronic Commerce environment, the main methods by which an enterprise searches for business information are as follows:

    • Using a General-purpose Search Engine (GSE) to search for business information. This covers a wide range of business information but returns too many irrelevant pages, which results in low precision, and it cannot meet the personalization requirements of the user.

    • Logging in to the web site of an enterprise to search for business information. This yields accurate information, such as the types and prices of that enterprise's products, but the search range is very limited, which results in low recall.

    • Logging in to a large business portal website to search for business information. Such a site contains a lot of product information, but not all enterprises publish their product information there, so compared with the entire body of business information on the Internet, the amount of information on these sites is very limited and cannot meet the requirements of the enterprise.

    • Using specialized search engines to search for business information. This gives good search results, but the number of results is limited and depends on the database of the site.

    In view of the above problems, this paper designs a business information search system for the Electronic Commerce environment, which can automatically gather business information from the Internet and extract the business information required by the enterprise directly from a database. The system adopts a meta-search engine, which can be integrated with several General-purpose Search Engines (GSE) to extend the search range and improve recall, and applies information retrieval, web mining and agent technology to analyze and filter business information and to extract customer, supplier and product information of potential value to the enterprise, thereby improving precision.

    This work is supported by National Natural Science Foundation under Grant 74105110, Innovative Research Team Project of National Natural Science Foundation under Grant 60821063.

    2 DOMAIN-SPECIFIC BUSINESS INFORMATION SEARCH

    Many scholars have studied domain-specific business search [2-6]. Paper [2] proposed an agent-based framework for a dynamic information retrieval process that manages business status intelligently and dynamically. Paper [3] presented a method for building a personalized domain-specific search engine; it adopts domain-based grading thesauruses and a Chinese segmentation algorithm with a disambiguation mechanism to ensure high accuracy, and relies on the retrospective, state-memory and linear nature of the segmentation algorithm to ensure the engine's efficiency. Paper [4] proposed a business search algorithm based on a Hopfield neural network: a set of extended query terms is generated automatically by the Hopfield neural network from the query keywords the user inputs, and searching a general-purpose search engine with those extended query terms extends the search range and improves search precision. Paper [5] proposed a business information retrieval model based on a Bayesian Network (BN); in this model the customized query requirement of the enterprise is expressed in terms of predefined illustrative documents related to the business domain, and the similarities between the documents and the query are evaluated with the conditional probabilities among the nodes in the BN. Paper [6] proposed a method for building a domain-specific search engine based on a Meta-Search Engine (MSE) on the Internet. It selects keywords by the Odds Ratio (OR) method and weights them by the TF-IDF method. The domain query expression is derived by the Decision Tree (DT) method. Finally, it ranks the returned documents by the Extended Boolean Model. The method can effectively remedy the drawbacks of the KS method and performs better in terms of precision and recall.

    Based on the study and analysis of existing theoretical research, this paper designs and implements domain-specific business search software that adopts MSE as its framework and applies the theory to a practical system in modular form.

    3 DESIGN OF DOMAIN-SPECIFIC BUSINESS INFORMATION SEARCH SYSTEM

    In order to help business people obtain information on commodities, suppliers and customers, and to provide a reference for further inquiry and commodity pricing, the system is designed to automatically collect the business information required by the enterprise from the Internet according to the characteristics of business information.

    3.2 Architecture based on MSE

    The system adopts an MSE-based architecture, as shown in Fig 1. It is divided into 3 main modules: the meta-search and system search module, the user search module and the user interaction module. Each module includes various sub-modules.

    The system uses Lucene [8] as its database to enhance the indexing and retrieval functions. Lucene is a Java-based full-text indexing toolkit that provides a number of API functions and a flexible (customizable) data storage structure, and it can easily be embedded into various applications to provide or enhance indexing and retrieval functions. Unlike other databases, Lucene stores information in the form of index files, and its retrieval is faster than that of other databases. In addition, it does not adopt a B-tree structure, which causes a large number of I/O operations when the index is updated; instead it creates a new index file and then merges these small index files into a large one, which improves indexing efficiency without affecting search efficiency.

    The system builds its index and implements the user search functions primarily through the APIs provided by Lucene. Some information retrieval models (such as the Hopfield neural network based information retrieval model [4]) could also be added to the user search module to enable the user to obtain more precise business information.
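The indexing strategy attributed to Lucene in this paper (write each update as a new small index file, then merge the small files into a large one) can be illustrated with a toy inverted index. The sketch below is written in Python for brevity and is neither Lucene's implementation nor its API; the class name SegmentedIndex, the merge_factor threshold and the sample documents are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


class SegmentedIndex:
    """Toy inverted index: new documents go into small segments that are merged later."""

    def __init__(self, merge_factor=3):
        # Hypothetical threshold: merge once this many segments exist.
        self.merge_factor = merge_factor
        # Each segment maps a term to the set of document ids containing it.
        self.segments = []

    def add_document(self, doc_id, text):
        # A new document becomes its own small segment; existing
        # segments are never rewritten in place.
        segment = defaultdict(set)
        for term in text.lower().split():
            segment[term].add(doc_id)
        self.segments.append(dict(segment))
        if len(self.segments) >= self.merge_factor:
            self.merge_segments()

    def merge_segments(self):
        # Combine all small segments into one larger segment.
        merged = defaultdict(set)
        for segment in self.segments:
            for term, doc_ids in segment.items():
                merged[term] |= doc_ids
        self.segments = [dict(merged)]

    def search(self, term):
        # A query consults every segment, so results are identical
        # before and after a merge.
        hits = set()
        for segment in self.segments:
            hits |= segment.get(term.lower(), set())
        return sorted(hits)


if __name__ == "__main__":
    index = SegmentedIndex(merge_factor=3)
    index.add_document(1, "steel pipe supplier")
    index.add_document(2, "steel plate price")
    print(index.search("steel"))      # two segments are searched
    index.add_document(3, "copper supplier")
    print(len(index.segments))        # segments were merged into one
    print(index.search("supplier"))
```

In this sketch an update only appends a small structure, and the cost of reorganizing the index is deferred to the merge step, which is the trade-off the text describes in contrast to updating a B-tree in place.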

    3.1 Main functions of the system

    • Meta-search engine function: The user can enter several keywords belonging to the business domain; the system then searches several GSEs for business information, removes duplicate and invalid pages, parses the remaining pages, and extracts the abstract or full text of each page.

    • System search function: The system gathers relevant information from the Internet regularly and automatically, according to the predetermined system search keywords and search times, and stores it in the database.

    • User search function: The system retrieves information from the database according to the query statement entered by the user. As

    business inform