????SQL????????????????????????????NoSQL??Not Only SQL????????????SQL????MySQL?????????HDFS??????У???????????????Sqoop?????HDFS?е??????Hive??????????Sqoop?????????????????????????????NoSQL???????????????????????????
????01 ???????
????SQL?????????????????????????????????????????????????????????????????????????????????????????Щ???????μ???????????????????????SQL????????Щ??????????????NoSQL??
????NoSQL?????????No SQL??????????????SQL??????????GNU??????дGNU is Not Unix???????????????????????NoSQL?????????????SQL?????úú?????????????????????????á??????????????????????Щ???????????????????ó??????SQL??????????????NoSQL????????????Not Only SQL????????????SQL?????к????????????????????????????????????????????????????????NoSQL?????£??????????????SQL???????
????NoSQL?????????????????????????Mongodb??Hadoop??Hive??Cassandra??Hbase??Redis?????????????????Щ??????????????к????????á???SQL????????????????MySQL???
???????????У???????????????MySQL??????е???????????????????汾??MySQL????????????????????????????????????????????Hadoop????????У?????????????Sqoop?????????????????????SQL??NoSQL??????????????
02 MySQL to HDFS
??????MySQL????HDFS??????У?????????????????????????????????????????????HDFS?У??????á?
????Sqoop??????÷??????????????????????????????????sqoop????????в???д?????????????д????????????檔???????????????????????????д????????????棬????????????????HDFS??????У???????????2?????
???????import???
??????????????????????IP?????????????????????
????????·??????????????????·??
???????н??????????????????ν????з?
???????????????η???
???????????????????Limit???????????
??????????????????????
????????????????
# File name: your_table.options
import
--connect
jdbc:mysql://1.2.3.4/db_name
--username
your_username
--password
your_passwd
--table
your_table
--null-string
NULL
--columns
id,name
# --query
# select id, name, concat(id, name) from your_table where $CONDITIONS limit 100
# --where
# "status != 'D'"
--delete-target-dir
--target-dir
/pingjia/open_model_detail
--fields-terminated-by
'\01'
--split-by
id
--num-mappers
1
????????????????
import marks this as an "import" job, meaning data flows into HDFS: from MySQL to HDFS.
connect is the database connection string; it carries the MySQL server's IP address and the database name.
username and password are the database credentials. table is the table to import; if query is used instead, table must not be specified.
columns lists the columns to import, separated by commas. If query is used, columns is not needed.
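query lets you supply your own SQL statement to control exactly what is imported. The statement must contain a where clause that includes $CONDITIONS; even when no filter is needed, write where $CONDITIONS. $CONDITIONS is a placeholder that Sqoop fills in at run time.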
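In a Sqoop free-form query, $CONDITIONS is the spot where each mapper's range predicate ends up. The Python sketch below only illustrates that substitution idea; it is not Sqoop's actual code, and the function name `render_query` and the example bounds are hypothetical.

```python
def render_query(template: str, predicate: str) -> str:
    """Substitute one mapper's range predicate for the $CONDITIONS placeholder."""
    if "$CONDITIONS" not in template:
        # Mirrors the rule that a free-form query must carry the placeholder.
        raise ValueError("query must contain $CONDITIONS")
    return template.replace("$CONDITIONS", f"({predicate})")


# Hypothetical bounds for a single mapper.
template = "select id, name from your_table where $CONDITIONS"
print(render_query(template, "id >= 1 AND id < 250"))
```

With several mappers, each one would run the same statement with a different predicate, so the slices do not overlap.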
where sets a filter condition for the import.
target-dir is the HDFS directory that the MySQL data is written to. When the data feeds a partitioned Hive table, point target-dir at the partition's directory (for example, the one for year=2015) and then add that partition in Hive. delete-target-dir removes the target directory first if it already exists.
fields-terminated-by sets the field delimiter used in the output files. '\01' is Hive's default field delimiter, so files written this way can be read by a Hive table without extra configuration.
num-mappers sets the number of mappers (parallel tasks) used for the import; Sqoop uses 4 by default. split-by names the column used to divide the work, for example id, so that the id range is cut into 4 slices, one per mapper. For small tables, setting num-mappers to 1 is enough.
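The interplay of split-by and num-mappers can be pictured as cutting the range of the split column into contiguous slices. The sketch below is a simplified illustration for an integer column, assuming the minimum and maximum id are already known; the function name `split_ranges` is hypothetical and this is not Sqoop's own splitter.

```python
from typing import List, Tuple


def split_ranges(lo: int, hi: int, num_mappers: int) -> List[Tuple[int, int]]:
    """Cut the inclusive range [lo, hi] into up to num_mappers contiguous slices."""
    if num_mappers < 1 or hi < lo:
        raise ValueError("need num_mappers >= 1 and lo <= hi")
    total = hi - lo + 1
    base, extra = divmod(total, num_mappers)
    ranges = []
    start = lo
    for i in range(num_mappers):
        # The first `extra` slices take one additional value each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # fewer values than mappers
        ranges.append((start, start + size - 1))
        start += size
    return ranges


print(split_ranges(1, 1000, 4))
# → [(1, 250), (251, 500), (501, 750), (751, 1000)]
```

Slices of equal width only carry equal work when the ids are spread evenly; a column with large gaps leaves some mappers nearly idle, which is one reason to choose the split column carefully.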
????????????????????????????????????У????????#??????????????У???????????????????ú????????????????????ò????
sqoop --options-file your_table.options
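An options file is expanded into ordinary command-line arguments, one per line, with comment lines (starting with #) and empty lines ignored. The sketch below shows that expansion in a deliberately simplified form: it does not handle quoted strings or backslash line continuation, and the function name `read_options` is hypothetical.

```python
from typing import List


def read_options(text: str) -> List[str]:
    """Turn options-file text into an argument list, skipping comments and blanks."""
    args = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # comment or empty line
        args.append(line)
    return args


sample = """import
--connect
jdbc:mysql://1.2.3.4/db_name
# --where
# "status != 'D'"
--num-mappers
1
"""
print(["sqoop"] + read_options(sample))
```

This also shows why a commented-out option needs both its flag and its value commented: if only the value were commented out, the next line would be read as that flag's value.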
??????????????????????????????С???????????
Transfer statistics from a few imports, for reference:
Transferred 3.9978 GB in 811.4697 seconds (5.0448 MB/sec)
Retrieved 18589739 records.
Transferred 3.4982 GB in 350.2751 seconds (10.2266 MB/sec)
Retrieved 16809945 records.
Transferred 846.5802 MB in 164.0938 seconds (5.1591 MB/sec)
Retrieved 5242290 records.
Transferred 172.9216 MB in 72.2055 seconds (2.3949 MB/sec)
Retrieved 1069275 records.
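The rate in parentheses is simply size divided by elapsed time. A small sketch for recomputing it from a log line follows, assuming 1 GB = 1024 MB (which matches the figures above); the helper name `throughput_mb_per_sec` is hypothetical.

```python
import re

# Matches the size, unit and elapsed seconds of a "Transferred ..." log line.
LINE = re.compile(r"Transferred ([\d.]+) (GB|MB) in ([\d.]+) seconds")


def throughput_mb_per_sec(line: str) -> float:
    """Recompute MB/sec from a 'Transferred X GB|MB in Y seconds' log line."""
    m = LINE.search(line)
    if m is None:
        raise ValueError("not a transfer summary line")
    size, unit, seconds = float(m.group(1)), m.group(2), float(m.group(3))
    size_mb = size * 1024 if unit == "GB" else size
    return size_mb / seconds


line = "Transferred 3.9978 GB in 811.4697 seconds (5.0448 MB/sec)"
print(round(throughput_mb_per_sec(line), 2))
```

Dividing the record count by the same elapsed time gives rows per second, which is often easier to compare across tables with different row widths.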