MapReduce implementation — Tencent big data QQ common-friend recommendation system ("People You May Know")

Posted: 2020-09-25 21:20:55


Taking Tencent's big-data QQ common-friend recommendation ("people you may know") as the model, let's implement the core idea with MapReduce.

Test data: the name before the colon is a QQ user; the names after the colon are that user's friends.

A:B,C,D,F,E,O

B:A,C,E,K

C:F,A,D,I

D:A,E,F,L

E:B,C,D,M,L

F:A,B,C,D,E,O,M

G:A,C,D,E,F

H:A,C,D,E,O

I:A,O

J:B,O

K:A,C,D

L:D,E,F

M:E,F,G

O:A,H,I,J

The implementation takes two steps, i.e. two MapReduce jobs.

Step 1: the first map and reduce pass

1. In the map phase, emit each of the user's friends as the key and the user itself as the value.

For example, the first input line produces: B->A C->A D->A F->A E->A O->A

The second line produces: A->B C->B E->B K->B

The shuffle phase then groups these records by key and hands each group to a reduce task: for key B it collects B->A, B->C, B->D, ..., so the reducer receives B -> A C D ..., i.e. every user who has B as a friend.

In the reducer, sort that user list and emit every pair of users as the key, with the shared friend as the value (a quick plain-Java sketch of this pairing step follows the example):

A-C B

A-D B

C-D B

.....
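Before the Hadoop version, here is a minimal plain-Java sketch of the same idea on a few of the test lines (the class name is made up for illustration): invert user -> friends into friend -> users, sort each user list, and print every user pair with the shared friend.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PairSketch {
    public static void main(String[] args) {
        // friend -> list of users who have that friend
        Map<String, List<String>> friendToUsers = new TreeMap<>();
        String[] lines = {"A:B,C,D,F,E,O", "B:A,C,E,K", "C:F,A,D,I"}; // first three test lines
        for (String line : lines) {
            String[] parts = line.split(":");
            for (String friend : parts[1].split(",")) {
                friendToUsers.computeIfAbsent(friend, k -> new ArrayList<>()).add(parts[0]);
            }
        }
        // same pairing logic as the reducer below: sort, then emit every user pair
        for (Map.Entry<String, List<String>> e : friendToUsers.entrySet()) {
            List<String> users = e.getValue();
            Collections.sort(users);
            for (int i = 0; i < users.size() - 1; i++) {
                for (int j = i + 1; j < users.size(); j++) {
                    System.out.println(users.get(i) + "-" + users.get(j) + "\t" + e.getKey());
                }
            }
        }
    }
}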

Code implementation

package com.xuyu.friends;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;

public class CommonFriendsOne {

    public static class CommonFriendsMapper extends Mapper<LongWritable, Text, Text, Text> {
        Text k = new Text();
        Text v = new Text();

        // input:  A:B,C,D,F,E,O
        // output: B->A  C->A  D->A ...
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String[] userAndFriends = value.toString().split(":");
            String user = userAndFriends[0];
            String[] friends = userAndFriends[1].split(",");
            v.set(user);
            for (String f : friends) {
                k.set(f);
                // key = friend, value = user
                context.write(k, v);
            }
        }
    }

    public static class CommonFriendsOneReduce extends Reducer<Text, Text, Text, Text> {
        // one group looks like:   B -> A E F J ...
        // another group:          C -> B F E J ...
        // the values arrive unordered, so we sort them first
        @Override
        protected void reduce(Text friend, Iterable<Text> users, Context context) throws IOException, InterruptedException {
            ArrayList<String> userList = new ArrayList<String>();
            for (Text user : users) {
                userList.add(user.toString());
            }
            // sort so pair keys come out in a canonical order (always A-B, never B-A)
            Collections.sort(userList);
            // emit every pair of users, with the shared friend as the value
            for (int i = 0; i < userList.size() - 1; i++) {
                for (int j = i + 1; j < userList.size(); j++) {
                    context.write(new Text(userList.get(i) + "-" + userList.get(j)), friend);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        // locate the jar containing this class
        job.setJarByClass(CommonFriendsOne.class);
        // 2. the mapper and reducer implementations for this job
        job.setMapperClass(CommonFriendsMapper.class);
        job.setReducerClass(CommonFriendsOneReduce.class);
        // 3. key/value types produced by the mapper
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        // 4. key/value types produced by the reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(job, new Path("F:\\mrdata\\friends\\input"));
        FileOutputFormat.setOutputPath(job, new Path("F:\\mrdata\\friends\\output"));
        boolean res = job.waitForCompletion(true);
        System.exit(res ? 0 : -1);
    }
}
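One practical note: FileOutputFormat refuses to start a job whose output directory already exists, so a rerun fails until F:\mrdata\friends\output is deleted. A small sketch of the usual fix using the standard FileSystem API (the helper class name is illustrative):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class OutputCleaner {
    // delete the output directory if it exists, so the job can be rerun;
    // call e.g. OutputCleaner.clean(conf, "F:\\mrdata\\friends\\output") before submitting
    public static void clean(Configuration conf, String dir) throws Exception {
        Path out = new Path(dir);
        FileSystem fs = out.getFileSystem(conf); // resolves local FS or HDFS from the path and conf
        if (fs.exists(out)) {
            fs.delete(out, true); // true = recursive
        }
    }
}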

Job output

The part-r-00000 file. Each line of the actual file is one pair<TAB>commonFriend record; the records are wrapped here, several to a line:

B-C A   B-D A   B-F A   B-G A   B-H A   B-I A   B-K A   B-O A
C-D A   C-F A   C-G A   C-H A   C-I A   C-K A   C-O A
D-F A   D-G A   D-H A   D-I A   D-K A   D-O A
F-G A   F-H A   F-I A   F-K A   F-O A
G-H A   G-I A   G-K A   G-O A
H-I A   H-K A   H-O A
I-K A   I-O A
K-O A
A-E B   A-F B   A-J B   E-F B   E-J B   F-J B
A-B C   A-E C   A-F C   A-G C   A-H C   A-K C   B-E C   B-F C
B-G C   B-H C   B-K C   E-F C   E-G C   E-H C   E-K C   F-G C
F-H C   F-K C   G-H C   G-K C   H-K C
A-C D   A-E D   A-F D   A-G D   A-H D   A-K D   A-L D   C-E D
C-F D   C-G D   C-H D   C-K D   C-L D   E-F D   E-G D   E-H D
E-K D   E-L D   F-G D   F-H D   F-K D   F-L D   G-H D   G-K D
G-L D   H-K D   H-L D   K-L D
A-B E   A-D E   A-F E   A-G E   A-H E   A-L E   A-M E   B-D E
B-F E   B-G E   B-H E   B-L E   B-M E   D-F E   D-G E   D-H E
D-L E   D-M E   F-G E   F-H E   F-L E   F-M E   G-H E   G-L E
G-M E   H-L E   H-M E   L-M E
A-C F   A-D F   A-G F   A-L F   A-M F   C-D F   C-G F   C-L F
C-M F   D-G F   D-L F   D-M F   G-L F   G-M F   L-M F
C-O I
D-E L
E-F M
A-F O   A-H O   A-I O   A-J O   F-H O   F-I O   F-J O   H-I O
H-J O   I-J O

Step 2: run a second MapReduce job over the result files of step 1.

The output should look like the following; the first record means that users A and B share the two common friends C and E:

A-B  [C, E]
A-C  [D, F]
A-D  [E, F]
A-E  [B, C, D]
A-F  [B, C, D, E, O]
A-G  [C, D, E, F]

Code implementation

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;

public class CommonFriendsOne2 {

    public static class CommonFriendsMapper extends Mapper<LongWritable, Text, Text, Text> {
        // input lines look like: A-B<TAB>C  ->  key = "A-B", value = "C"
        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String[] friendsAndUser = value.toString().split("\t");
            context.write(new Text(friendsAndUser[0]), new Text(friendsAndUser[1]));
        }
    }

    public static class CommonFriendsOneReduce extends Reducer<Text, Text, Text, Text> {
        // key = a user pair such as "A-B", values = all of their common friends
        @Override
        protected void reduce(Text friends, Iterable<Text> users, Context context) throws IOException, InterruptedException {
            ArrayList<String> userList = new ArrayList<String>();
            for (Text user : users) {
                userList.add(user.toString());
            }
            // sort the common friends for stable, readable output
            Collections.sort(userList);
            context.write(friends, new Text(userList.toString()));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);
        // locate the jar containing this class
        job.setJarByClass(CommonFriendsOne2.class);
        // 2. the mapper and reducer implementations for this job
        job.setMapperClass(CommonFriendsMapper.class);
        job.setReducerClass(CommonFriendsOneReduce.class);
        // 3. key/value types produced by the mapper
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        // 4. key/value types produced by the reducer
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(job, new Path("F:\\mrdata\\friends\\output"));
        FileOutputFormat.setOutputPath(job, new Path("F:\\mrdata\\friends\\output2"));
        boolean res = job.waitForCompletion(true);
        System.exit(res ? 0 : -1);
    }
}
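Here the two jobs are run by hand, one after the other; they could also be chained in a single driver that submits the second job only after the first succeeds. A minimal sketch, assuming both classes above live in the same package and on the same classpath (the driver class name is illustrative; paths are the same as above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CommonFriendsDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // job 1: friend->user inversion, then user-pair -> shared friend
        Job job1 = Job.getInstance(conf, "common-friends-step1");
        job1.setJarByClass(CommonFriendsDriver.class);
        job1.setMapperClass(CommonFriendsOne.CommonFriendsMapper.class);
        job1.setReducerClass(CommonFriendsOne.CommonFriendsOneReduce.class);
        job1.setOutputKeyClass(Text.class);
        job1.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(job1, new Path("F:\\mrdata\\friends\\input"));
        FileOutputFormat.setOutputPath(job1, new Path("F:\\mrdata\\friends\\output"));
        if (!job1.waitForCompletion(true)) System.exit(1); // stop if step 1 fails

        // job 2: group user pairs and collect their common friends
        Job job2 = Job.getInstance(conf, "common-friends-step2");
        job2.setJarByClass(CommonFriendsDriver.class);
        job2.setMapperClass(CommonFriendsOne2.CommonFriendsMapper.class);
        job2.setReducerClass(CommonFriendsOne2.CommonFriendsOneReduce.class);
        job2.setOutputKeyClass(Text.class);
        job2.setOutputValueClass(Text.class);
        FileInputFormat.setInputPaths(job2, new Path("F:\\mrdata\\friends\\output"));
        FileOutputFormat.setOutputPath(job2, new Path("F:\\mrdata\\friends\\output2"));
        System.exit(job2.waitForCompletion(true) ? 0 : 1);
    }
}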

The final output (one pair<TAB>[common friends] record per line in the actual file; wrapped here, several records to a line):

A-B [C, E]   A-C [D, F]   A-D [E, F]   A-E [B, C, D]   A-F [B, C, D, E, O]
A-G [C, D, E, F]   A-H [C, D, E, O]   A-I [O]   A-J [B, O]   A-K [C, D]
A-L [D, E, F]   A-M [E, F]   B-C [A]   B-D [A, E]   B-E [C]
B-F [A, C, E]   B-G [A, C, E]   B-H [A, C, E]   B-I [A]   B-K [A, C]
B-L [E]   B-M [E]   B-O [A]   C-D [A, F]   C-E [D]
C-F [A, D]   C-G [A, D, F]   C-H [A, D]   C-I [A]   C-K [A, D]
C-L [D, F]   C-M [F]   C-O [A, I]   D-E [L]   D-F [A, E]
D-G [A, E, F]   D-H [A, E]   D-I [A]   D-K [A]   D-L [E, F]
D-M [E, F]   D-O [A]   E-F [B, C, D, M]   E-G [C, D]   E-H [C, D]
E-J [B]   E-K [C, D]   E-L [D]   F-G [A, C, D, E]   F-H [A, C, D, E, O]
F-I [A, O]   F-J [B, O]   F-K [A, C, D]   F-L [D, E]   F-M [E]
F-O [A]   G-H [A, C, D, E]   G-I [A]   G-K [A, C, D]   G-L [D, E, F]
G-M [E, F]   G-O [A]   H-I [A, O]   H-J [O]   H-K [A, C, D]
H-L [D, E]   H-M [E]   H-O [A]   I-J [O]   I-K [A]
I-O [A]   K-L [D]   K-O [A]   L-M [E, F]
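The two jobs stop at the common-friend lists. To turn them into actual "people you may know" suggestions, a simple heuristic is to recommend, for each user, the other user of each pair, ranked by how many friends they share; note that some pairs (e.g. A-B) are already direct friends in the data, so a real recommender would also filter out existing friendships. A minimal local sketch over the step-2 output (class name and file path are illustrative):

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class RecommendSketch {
    public static void main(String[] args) throws IOException {
        // user -> (candidate -> number of common friends)
        Map<String, Map<String, Integer>> scores = new HashMap<>();
        // each line of the step-2 output: "A-B\t[C, E]"
        for (String line : Files.readAllLines(Paths.get("F:\\mrdata\\friends\\output2\\part-r-00000"))) {
            String[] kv = line.split("\t");
            String[] pair = kv[0].split("-");
            int common = kv[1].split(",").length; // commas + 1 = number of common friends
            scores.computeIfAbsent(pair[0], k -> new HashMap<>()).put(pair[1], common);
            scores.computeIfAbsent(pair[1], k -> new HashMap<>()).put(pair[0], common);
        }
        // print each user's candidates, most common friends first
        for (Map.Entry<String, Map<String, Integer>> e : new TreeMap<>(scores).entrySet()) {
            List<Map.Entry<String, Integer>> cands = new ArrayList<>(e.getValue().entrySet());
            cands.sort((a, b) -> b.getValue() - a.getValue());
            System.out.println(e.getKey() + " -> " + cands);
        }
    }
}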

Copyright @须臾之余 /u/3995125
