Face Recognition and Speech Synthesis Based on the Baidu AI Open Platform
Project: face recognition and speech synthesis based on Baidu AI
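Project requirements
(1) Face recognition
Upload a photo of a person from the web page. The Java back end receives the image, converts it to the encoding the platform expects (Base64 in the example code), and calls the cloud platform API to recognize the facial features. It receives the age, gender, attractiveness score, and other information returned by the platform and sends it back to the web page for display.
(2) Face comparison
Upload two photos of people from the web page. The Java back end receives the images, converts them in the same way, calls the cloud platform API to compare the photos, and returns the similarity.
(3) Speech recognition
Upload an audio file from the web page. The back end checks the file format and transcodes the file if it is not WAV. It then calls the platform API for recognition and returns the recognized text to the web page for display.
(4) Speech synthesis
Submit the text and the voice type from the web page. After receiving them, the back end calls the platform API to generate the audio data and finally writes the data out as an MP3 file, which can be downloaded from the web page.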
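Requirement (3) depends on telling WAV files apart from everything else. Below is a minimal sketch of one way to do that on the server, assuming the check is made on the first bytes of the upload rather than on its name or MIME type. A WAV file starts with the ASCII tag RIFF and carries the tag WAVE at offset 8. The class and method names are hypothetical and are not taken from the project.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original project.
final class WavSniffer {
    private WavSniffer() {}

    /** Returns true if the given leading bytes look like a RIFF/WAVE header. */
    static boolean looksLikeWav(byte[] head) {
        // The two tags occupy bytes 0-3 and 8-11, so at least 12 bytes are needed.
        if (head == null || head.length < 12) {
            return false;
        }
        String riff = new String(head, 0, 4, StandardCharsets.US_ASCII);
        String wave = new String(head, 8, 4, StandardCharsets.US_ASCII);
        return "RIFF".equals(riff) && "WAVE".equals(wave);
    }
}
```

A file that fails this check would then go through the transcoding step before being sent to the platform.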
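Project design
The project uses a client / server / platform architecture. The client displays the feature pages, uploads data, and shows the processing results. The server receives the client's data, transcodes it, calls the platform APIs, and returns the results in its response. The platform receives the server's data and performs face recognition, face comparison, speech recognition, and speech synthesis.
Overall architecture
Overall logic
Front-end design (home page, face detection, face comparison, speech recognition, and speech synthesis)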
index.html
<!doctype html> <html lang="zh-cn"> <head> <meta charset="utf-8"> <title>人工智能 未来已来</title> <link rel="stylesheet" href="css/button.min.css" /> <link rel="stylesheet" href="css/style.css" /> <script type='text/javascript' src='js/jquery-1.11.1.min.js'></script> <script type='text/javascript' src='js/jquery.particleground.min.js'></script> <script type='text/javascript' src='js/ai.js'></script> </head> <body> <div id="context"> <div class="intro"> <div class="position"> <h1>人工智能 未来已来</h1> <!--start button, nothing above this is necessary --> <div class="svg-wrapper"> <svg height="140" width="450" xmlns="http://www.w3.org/2000/svg"> <rect id="shape" height="140" width="300" /> <div id="text"> <a href="face_recognition.html"><span class="spot"></span>人脸检测</a> </div> </svg> </div> <div class="svg-wrapper"> <svg height="140" width="450" xmlns="http://www.w3.org/2000/svg"> <rect id="shape" height="140" width="300" /> <div id="text"> <a href="face_match.html"><span class="spot"></span>人脸比对</a> </div> </svg> </div> <!--next button --> <div class="svg-wrapper"> <svg height="140" width="450" xmlns="http://www.w3.org/2000/svg"> <rect id="shape" height="140" width="300" /> <div id="text"> <a href="speech_recognition.html"><span class="spot"></span>语音识别</a> </div> </svg> </div> <!--next button --> <div class="svg-wrapper"> <svg height="140" width="450" xmlns="http://www.w3.org/2000/svg"> <rect id="shape" height="140" width="300" /> <div id="text"> <a href="speech_produce.html"><span class="spot"></span>语音合成</a> </div> </svg> </div> <!--end button --> </div> </div> </div> </body> </html>
face_recognition.html
<!doctype html> <html> <head> <meta charset="utf-8" /> <title>人脸识别</title> <meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1,maximum-scale=1,user-scalable=no" /> <script src="js/jquery-2.1.0.js" type="text/javascript" charset="utf-8"></script> <script src="js/jquery.particleground.min.js" type="text/javascript" charset="utf-8"></script> <script src="js/ai.js" type="text/javascript" charset="utf-8"></script> <link rel="stylesheet" type="text/css" href="css/style.css" /> <style type="text/css"> .cent_bg { width: 80%; height: 22em; border: 1px solid rgba(255, 255, 255, 0.5); margin: auto; padding: 3% 6%; border-radius: 10%; font-size: 1.5em; line-height: 2em; position: relative; } .cent { margin-top: 1.5em; overflow-y: auto; height: 80%; text-indent: 2em; text-align: left; padding-right: 0.2em; } ::-webkit-scrollbar { width: 10px; background-color: rgba(255,255,255,0.2); } /*定义滚动条轨道 内阴影+圆角*/ ::-webkit-scrollbar-track { border-radius: 5px; background-color: transparent; } /*定义滑块 内阴影+圆角*/ ::-webkit-scrollbar-thumb { border-radius: 10px; background-color: rgba(255,255,255,0.7); } .btn{ margin-top: 20px; width: 70%; height:80px; transition: 0.4s; border-radius: 1em; border: 1px solid rgba(255, 255, 255, 0.5); border-top:none; z-index: 66; background: rgba(255, 255, 255, 0.2); } .btn:hover{ border: 1px solid rgba(255, 255, 255, 0.8) !important; border-top:none !important; } .btn span{ width: 40%; height: 84%; margin-top:6px; line-height: 38px; display: inline-block; border: 1px solid #009ffd; color: #fff; background: rgba(255,255,255,0.4); border-radius: 4px; cursor: pointer; } .btn span:nth-of-type(1){ margin-right: 15px; } .btn span:nth-of-type(2){ margin-left: 15px; } .img{ width: 70%; height: 100%; float: left; position: relative; } .xinxi{ width: 30%; height: 100%; float: left; } .biankuang{ width: 92%; height: 150%; top: -100px; position: absolute; background: url(img/biankuang.png) no-repeat; background-size: 100% 100%; } .xinxi{ 
text-align: left; } .xinxi span{ margin-top: 50px; } .xinxi span a{ color: #ffffff; text-decoration: none; } #neirong{ color: red; } </style> </head>
<body>
<div id="context"> <div class="intro"> <div class="cent_bg"> <div class="img"> <div class="biankuang"> <img src="img/img111.jpg" id='img' style="width: 83%;height: 55%;position: absolute;left: 49px;top: 150px;"/> </div> </div> <div class="xinxi"> <span style="display: block;">性别:<a href="" id="sex"></a></span> <span style="display: block;">年龄:<a href="" id="age"></a><span style="margin-left: 10px;">岁</span></span> <span style="display: block;">表情:<a href="" id="expression"></a></span> <span style="display: block;">颜值:<a href="" id="beauty"></a></span> <span style="display: block;"><a href="" id="neirong"></a></span> </div> </div> <div class="btn"> <form id="renlian" method="post"> <span style="position: relative;">提交图片 <input type="file" name="image" value="" id="uploading" onchange="test()" style="opacity: 0;width: 100%;position: absolute;height: 100%;display: block;top: 0px;" /> </span> <span id="tijiao">开始识别</span> </form> </div> </div> </div>
<script type="text/javascript">
// Validate the selected file and preview it in the page.
function test() {
  var input = document.getElementById("uploading");
  var file = input.files[0];
  if (!file) {
    return;
  }
  var filepath = input.value;
  var fileformat = filepath.substring(filepath.lastIndexOf(".")).toLowerCase();
  // Only .png, .jpg and .jpeg are accepted.
  if (!/^\.(png|jpg|jpeg)$/.test(fileformat)) {
    alert('上传错误,文件格式必须为:png/jpg/jpeg');
    return;
  }
  var fr = new FileReader();
  fr.onload = function() {
    document.getElementById("img").src = this.result;
  };
  fr.readAsDataURL(file);
}
// Post the form to the back end and show the returned attributes.
$("#tijiao").click(function() {
  $.ajax({
    type: "post",
    url: basepath + "/facedetect",
    dataType: "json",
    data: new FormData($('#renlian')[0]),
    processData: false,
    contentType: false,
    beforeSend: function() {
      uploading = true;
    },
    success: function(res) {
      if (res.status == "200") {
        $("#neirong").text("");
        $("#sex").text(res.data.gender);
        $("#age").text(res.data.age);
        $("#expression").text(res.data.expression);
        $("#beauty").text(res.data.beauty);
      } else {
        $("#neirong").text("无法识别");
      }
    },
    error: function(xhr, status, error) {
      $("#neirong").text("后台服务异常");
    }
  });
});
</script>
</body>
</html>
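The page above checks the file extension only in the browser, and that check is easy to bypass. A minimal sketch of repeating it on the server follows, assuming the back end has access to the uploaded file's name. The class name, method name, and allowed list are hypothetical and simply mirror the front-end rule (png, jpg, jpeg).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the original project.
final class ImageNameCheck {
    // Same extensions the front-end page accepts.
    private static final Set<String> ALLOWED =
            new HashSet<>(Arrays.asList("png", "jpg", "jpeg"));

    private ImageNameCheck() {}

    /** Returns true if the file name ends in one of the allowed extensions. */
    static boolean hasAllowedExtension(String fileName) {
        if (fileName == null) {
            return false;
        }
        int dot = fileName.lastIndexOf('.');
        // No dot, or a dot at the very end, means there is no extension.
        if (dot < 0 || dot == fileName.length() - 1) {
            return false;
        }
        String ext = fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        return ALLOWED.contains(ext);
    }
}
```

An extension check is only a first filter; it says nothing about the actual file contents.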
face_match.html
<!doctype html> <html> <head> <meta charset="utf-8"> <title>人脸比对</title> <script src="js/jquery-2.1.0.js" type="text/javascript" charset="utf-8"></script> <script src="js/jquery.particleground.min.js" type="text/javascript" charset="utf-8"></script> <script src="js/ai.js" type="text/javascript" charset="utf-8"></script> <link rel="stylesheet" type="text/css" href="css/style.css" /> <style type="text/css"> #dianji:hover{ transition: 0.5s; background: skyblue; } .sss{ display: inline-block; line-height:100px ; border-radius: 100%; width: 200px; margin: auto; height: 200px; border: 1px solid white; margin-top: 10%; } </style> </head> <body> <div id="context"> <div class="intro"> <div style="overflow: hidden; position: relative;top: 40%;margin: auto;text-align: center;"> <div style="border: 1px solid white; width: 30%; height: 400px; float: left;"><img id="ig1" src="" alt="" style="width: 100%; height: 100%"/></div> <div class="sss">相似度</div> <div style="border: 1px solid white; width: 30%; height: 400px; float: right;"><img id="ig2" src="" alt="" style="width: 100%; height: 100%;"/></div> </div> <form id="tttt" method="post" style="width: 80%; margin:40px auto;"> <div style="width: 40%;float: left; position: relative;"> <span style="position: absolute;left: 0px;background: skyblue;display: inline-block;height: 40px;line-height: 40px;width: 160px;"> 点击上传 </span> <input style="opacity: 0; position: absolute;left: 0px; height: 40px;" id="uploading" type="file" onchange="upload1()" name="image1"> </div> <div style="width: 40%;float: right;position: relative;"> <span style="position: absolute;right: 0px;background: skyblue;display: inline-block;height: 40px;line-height: 40px;width: 160px;"> 点击上传 </span> <input id="up" type="file" onchange="upload2()" name="image2" style="opacity: 0; position: absolute;right: 0px; height: 40px;" > </div> </form> <div 
id="dianji" style="border: 1px solid white;width: 140px;height: 40px;line-height: 40px;border-radius: 15px; margin: auto;">开始比对</div> </div> </div>
<script type="text/javascript">
// Validate the file chosen in the given input and preview it in the given <img>.
function previewImage(inputId, imgId) {
  var input = document.getElementById(inputId);
  var file = input.files[0];
  if (!file) {
    return;
  }
  var filepath = input.value;
  var fileformat = filepath.substring(filepath.lastIndexOf(".")).toLowerCase();
  // Only .png, .jpg and .jpeg are accepted.
  if (!/^\.(png|jpg|jpeg)$/.test(fileformat)) {
    alert('上传错误,文件格式必须为:png/jpg/jpeg');
    return;
  }
  var fr = new FileReader();
  fr.onload = function() {
    document.getElementById(imgId).src = this.result;
  };
  fr.readAsDataURL(file);
}
function upload1() {
  previewImage("uploading", "ig1");
}
function upload2() {
  previewImage("up", "ig2");
}
$(function() {
  // Post both images to the back end and show the similarity score.
  $("#dianji").click(function() {
    $.ajax({
      url: basepath + "/facematch",
      type: 'post',
      cache: false,
      data: new FormData($('#tttt')[0]),
      processData: false,
      contentType: false,
      dataType: "json",
      beforeSend: function() {
        uploading = true;
      },
      success: function(data) {
        if (data.status == 200) {
          document.querySelector(".sss").innerHTML = "相似度<br>" + data.data.score;
        } else {
          $(".sss").html("相似度<br><span style='color:red'>" + data.msg + "</span>");
        }
      },
      error: function(xhr, status, error) {
        $(".sss").html("相似度<br><span style='color:red'>后台服务异常</span>");
      }
    });
  });
});
</script>
</body>
</html>
speech_recognition.html
<!doctype html> <html> <head> <meta charset="utf-8" /> <title>语音识别</title> <meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1,maximum-scale=1,user-scalable=no" /> <script src="js/jquery-2.1.0.js" type="text/javascript" charset="utf-8"></script> <script src="js/jquery.particleground.min.js" type="text/javascript" 
charset="utf-8"></script> <script src="js/ai.js" type="text/javascript" charset="utf-8"></script> <link rel="stylesheet" type="text/css" href="css/style.css" /> <style type="text/css"> .cent_bg { width: 80%; height: 22em; border: 1px solid rgba(255, 255, 255, 0.5); margin: auto; padding: 3% 6%; border-radius: 10%; font-size: 1.5em; line-height: 2em; position: relative; } .cent { margin-top: 1.5em; overflow-y: auto; height: 80%; text-indent: 2em; text-align: left; padding-right: 0.2em; } ::-webkit-scrollbar { width: 10px; background-color: rgba(255, 255, 255, 0.2); } /*定义滚动条轨道 内阴影+圆角*/ ::-webkit-scrollbar-track { border-radius: 5px; background-color: transparent; } /*定义滑块 内阴影+圆角*/ ::-webkit-scrollbar-thumb { border-radius: 10px; background-color: rgba(255, 255, 255, 0.7); } #ttttt{ display: none; } .btn { margin-top: 20px; width: 70%; height: 80px; transition: 0.4s; border-radius: 1em; border: 1px solid rgba(255, 255, 255, 0.5); border-top: none; z-index: 66; overflow: hidden; position: relative; background: rgba(255, 255, 255, 0.2); } .btn:hover { border: 1px solid rgba(255, 255, 255, 0.8) !important; border-top: none !important; } .btn span { width: 40%; height: 84%; margin-top: 6px; line-height: 38px; display: inline-block; border: 1px solid #009ffd; color: #fff; background: rgba(255, 255, 255, 0.4); border-radius: 4px; cursor: pointer; } .btn span:nth-of-type(1) { margin-right: 15px; position: relative; } .btn span:nth-of-type(2) { margin-left: 15px; } input{ width: 100%; height: 100%; border: 1px solid red; position: absolute; top: 0; left: 0; opacity: 0; } .img{ width: 60px; height: 60px; margin: auto; text-align: center; position: relative; } .img img{ width: 100%; height: 100%; position: absolute; top: 0; left: 0; } .mmm{ width: 100%; height: 100%; position: absolute; top: 0; left: 0; z-index: 999; display: none; cursor: pointer; background: #b9fff4; color: aqua; line-height: 80px; } </style> </head> <body> <div id="context"> <div class="intro"> <div 
class="cent_bg"> <h3>语音识别内容:</h3> <div class="cent"> </div> </div> <div class="btn"> <span>上传文件 </span> <span>开始识别</span> <div class="mmm"> 正在识别... </div> </div> <form id="ttttt" action="" method="post"> <input type="file" name="voice" value=""> </form> </div> </div>
<script type="text/javascript">
$(function() {
  // The first button opens the hidden file input.
  $(".btn span:nth-of-type(1)").click(function() {
    $('#ttttt input[name="voice"]').click();
  });
  // The second button uploads the file and shows the recognized text.
  $(".btn span:nth-of-type(2)").click(function() {
    if (document.querySelector('#ttttt input[name="voice"]').value == "") {
      return alert("未选择文件");
    }
    $(".mmm").css("display", "block");
    $(".cent").html('<div class="img"><img src="img/timg.gif"/></div>');
    $.ajax({
      url: basepath + "/voicerecognize",
      type: 'post',
      cache: false,
      data: new FormData($('#ttttt')[0]),
      processData: false,
      contentType: false,
      dataType: "json",
      beforeSend: function() {
        uploading = true;
      },
      success: function(data) {
        if (data.status == "200") {
          $(".cent").html(data.data.text);
        } else {
          $(".cent").html("未能识别 " + data.msg);
        }
        $(".mmm").css("display", "none");
      },
      // Hide the "recognizing" overlay on failure as well, so the page does not stay blocked.
      error: function(xhr, status, error) {
        $(".cent").html("后台服务异常");
        $(".mmm").css("display", "none");
      }
    });
  });
});
</script>
</body>
</html>
speech_produce.html
<!doctype html> <html> <head> <meta charset="utf-8" /> <title>语音合成</title> <meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1,maximum-scale=1,user-scalable=no" /> <script src="js/jquery-2.1.0.js" type="text/javascript" charset="utf-8"></script> <script src="js/jquery.particleground.min.js" type="text/javascript" charset="utf-8"></script> <script src="js/ai.js" type="text/javascript" charset="utf-8"></script> <link rel="stylesheet" type="text/css" href="css/style.css" /> <style type="text/css"> #error { width: 60%; height: 120px; line-height: 120px; text-align: center; font-size: 1.5em; position: absolute; top: 50%; left: 50%; transform: translate(-50%, -65%); z-index: 999; } .cent_bg { width: 80%; height: 22em; border: 1px solid rgba(255, 255, 255, 0.5); margin: auto; padding: 3% 6%; border-radius: 10%; font-size: 1.5em; line-height: 2em; position: relative; } .cent { margin-top: 1em; width: 100%; background: rgba(0, 0, 0, .5); overflow-y: auto; height: 80%; color: white; font-size: 1.3em; text-indent: 1em; padding: .8em; border: none; resize: none; outline-color: white; } ::-webkit-scrollbar { width: 10px; background-color: rgba(255, 255, 255, 0.2); } /*定义滚动条轨道 内阴影+圆角*/ ::-webkit-scrollbar-track { border-radius: 5px; background-color: transparent; } /*定义滑块 内阴影+圆角*/ ::-webkit-scrollbar-thumb { border-radius: 10px; background-color: rgba(255, 255, 255, 0.7); } .btn { width: 70%; height: 40px; transition: 0.4s; border-bottom-left-radius: 1em; border-bottom-right-radius: 1em; border: 1px solid rgba(255, 255, 255, 0); border-top: none; z-index: 66; background: rgba(255, 255, 255, 0.2); padding: 5px 0px; } .btn #btnsub:hover { border: 1px solid rgba(255, 255, 255) !important; color: white; } .btn #btnsub { width: 40%; height: 100%; line-height: 25px; display: inline-block; border: 1px solid #009ffd; color: #fff; background: rgba(255, 255, 255, 0.4); border-radius: 4px; font-size: 1.1em; cursor: pointer; } #select { margin: auto; 
margin-top: 10px; width: 70%; height: 30px; border-top-left-radius: 1em; border-top-right-radius: 1em; transition: 0.4s; border-top: none; z-index: 66; background: rgba(255, 255, 255, 0.2); } #selectbox { width: 70%; height: 30px; margin: 0 auto; } #mantext, #womantext { font-size: 1.25em; line-height: 30px; float: left; } #mantext { text-align: center; } #woman {} .inputbox { width: 50%; height: 30px; float: left; } #man, #woman { margin-top: 6px; display: block; float: left; width: 20px; height: 20px; } </style> </head>
<body> <div id="error">请在此输入文本内容</div> <div id="context"> <div class="intro"> <form method="post" enctype="multipart/form-data" id="voice"> <div class="cent_bg"> <h3>请输入要合成的语音文本:</h3> <textarea id="content" class="cent" name="text"></textarea> </div> <div id="select"> <div id="selectbox"> <div class="inputbox"> <label id="mantext" for="man" style="float: right;">男声</label><input id="man" type="radio" name="voicetype" checked value="1" style="float: right;" /> </div> <div class="inputbox"> <input id="woman" type="radio" name="voicetype" value="2" /><label id="womantext" for="woman">女声</label> </div> </div> </div> <div class="btn"> <button id="btnsub" type="submit">合成语音</button> </div> </form> </div> </div>
<script>
$(function() {
  // Blink the hint text by toggling its fade the given number of times.
  function blinkHint(times) {
    var count = 0;
    var timer = setInterval(function() {
      $('#error').fadeToggle(700);
      count++;
      if (count == times) {
        clearInterval(timer);
      }
    }, 700);
  }
  // Blink the hint when the page loads.
  blinkHint(3);
  // Block submission and show the hint again if the text is empty.
  $('#btnsub').on('click', function() {
    var text = $('#content').val();
    if (text.length == 0) {
      blinkHint(4);
      return false;
    }
  });
  // Server request URL
  $('#voice').attr('action', basepath + "/voicegen");
});
</script>
</body>
</html>
ai.js
// Base URL of the back-end server
var basepath = "http://127.0.0.1:8080/aiproject";
// Background effect
$(document).ready(function() {
  $('#context').particleground({
    dotColor: '#5cbdaa',
    lineColor: '#5cbdaa'
  });
  $('.intro').css({
    'margin-top': -($('.intro').height() / 2)
  });
});
style.css
/*********css初始化*********/ html, body, div, span, applet, object, iframe, h1, h2, h3, h4, h5, h6, p, blockquote, pre, a, abbr, acronym, address, big, cite, code, del, dfn, em, img, ins, kbd, q, s, samp, small, strike, strong, sub, sup, tt, var, b, u, i, center, dl, dt, dd, ol, ul, li, fieldset, form, label, legend, table, caption, tbody, tfoot, thead, tr, th, td, article, aside, canvas, details, embed, figure, figcaption, footer, header, hgroup, menu, nav, output, ruby, section, summary, time, mark, audio, video { margin: 0; padding: 0; border: 0; font-size: 100%; font: inherit; vertical-align: baseline; } article, aside, details, figcaption, figure, footer, header, hgroup, menu, nav, section { display: block; } body { line-height: 1; } ol, ul { list-style: none; } blockquote, q { quotes: none; } blockquote:before, blockquote:after, q:before, q:after { content: ''; content: none; } table { border-collapse: collapse; border-spacing: 0; } /* particleground demo */ *{ -webkit-box-sizing: border-box; -moz-box-sizing: border-box; box-sizing: border-box; } html, body { width: 100%; height: 100%; /*overflow: scroll;*/ } /*********css初始化结束*********/ body { background: #202aa3; font-family: 'montserrat', sans-serif; color: #fff; line-height: 1.3; -webkit-font-smoothing: antialiased; } #particles { width: 100%; height: 100%; overflow: hidden; } .intro { position: absolute; left: 0; top: 50%; padding: 0 20px; width: 100%; text-align: center; } h1 { text-transform: uppercase; font-size: 85px; font-weight: 700; letter-spacing: 0.015em; } h1::after { content: ''; width: 60%; display: block; background: #fff; height: 10px; margin: 30px auto; line-height: 1.1; } p { margin: 0 0 30px 0; font-size: 24px; } .btn { display: inline-block; padding: 15px 30px; border: 2px solid #fff; text-transform: uppercase; letter-spacing: 0.015em; font-size: 18px; font-weight: 700; line-height: 1; color: #fff; text-decoration: none; -webkit-transition: all 0.4s; -moz-transition: all 0.4s; 
-o-transition: all 0.4s; transition: all 0.4s; } .btn:hover { color: #005544; border-color: #005544; } @media only screen and (max-width: 1000px) { h1 { font-size: 70px; } } @media only screen and (max-width: 800px) { h1 { font-size: 48px; } h1::after { height: 8px; } } @media only screen and (max-width: 568px) { .intro { padding: 0 10px; } h1 { font-size: 30px; } h1::after { height: 6px; } p { font-size: 18px; } .btn { font-size: 16px; } } @media only screen and (max-width: 320px) { h1 { font-size: 28px; } h1::after { height: 4px; } }
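API specification
Data exchange format: JSON
Request data: in addition to the request parameters, each request must also send the following parameters (otherwise a 403 status code is returned):
Response format: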
{"status": "200","msg":"","data": {"namename":"user","password":"password"}}
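To make the envelope concrete, here is a minimal sketch of a helper that produces JSON in this shape. It assumes string-only values in `data`, builds the JSON by hand to stay dependency-free, and uses "500" as the failure status purely for illustration. The class is hypothetical and is not the project's own `ResponseData`; a real project would normally use a JSON library instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the project's response wrapper.
final class ResponseEnvelope {
    private ResponseEnvelope() {}

    /** Builds a success envelope with status "200" and an empty msg. */
    static String success(Map<String, String> data) {
        return build("200", "", data);
    }

    /** Builds a failure envelope; the "500" status is an assumption. */
    static String fail(String msg) {
        return build("500", msg, new LinkedHashMap<String, String>());
    }

    // Escapes backslashes and double quotes only; control characters are not handled.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    private static String build(String status, String msg, Map<String, String> data) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"status\":").append(quote(status))
          .append(",\"msg\":").append(quote(msg))
          .append(",\"data\":{");
        boolean first = true;
        for (Map.Entry<String, String> e : data.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
            first = false;
        }
        return sb.append("}}").toString();
    }
}
```

The front-end pages only need `status`, `msg`, and the fields inside `data`, so the same envelope can serve all the JSON endpoints.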
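(1) Face recognition
Endpoint: facedetect
Request parameters: image (the file field posted by face_recognition.html)
Response parameters: gender, age, expression, beauty (the fields of data read by face_recognition.html)
(2) Face comparison
Endpoint: facematch
Request parameters: image1, image2 (the file fields posted by face_match.html)
Response parameters: score (the field of data read by face_match.html)
(3) Speech recognition
Endpoint: voicerecognize
Request parameters: voice (the file field posted by speech_recognition.html)
Response parameters: text (the field of data read by speech_recognition.html)
(4) Speech generation
Endpoint: voicegen
Response parameters:
An audio file in MP3 format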
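Request notes
Request body format: set Content-Type to application/json and format the request body as JSON.
Base64 encoding: the image in a request must be Base64-encoded. Base64-encoding an image means encoding the image data as a string and using that string in place of an image URL. You can read the image as binary data first and then encode it as Base64. Note that the Base64 string must not include the image header, such as data:image/jpg;base64,
Image formats: PNG, JPG, JPEG, and BMP are currently supported; GIF is not.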
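The Base64 note can be sketched with the standard `java.util.Base64` class. The helper below is hypothetical (it is not the project's `Base64Util`). It encodes raw image bytes and, for strings that arrive as data URLs, drops the `data:image/...;base64,` header that the note says must not be sent.

```java
import java.util.Base64;

// Hypothetical helper, not the project's Base64Util.
final class ImageBase64 {
    private static final String MARKER = ";base64,";

    private ImageBase64() {}

    /** Encodes raw image bytes as a Base64 string without any header. */
    static String encode(byte[] imageBytes) {
        return Base64.getEncoder().encodeToString(imageBytes);
    }

    /** Removes a leading data-URL header if one is present; otherwise returns the input unchanged. */
    static String stripDataUrlHeader(String value) {
        int marker = value.indexOf(MARKER);
        if (value.startsWith("data:") && marker >= 0) {
            return value.substring(marker + MARKER.length());
        }
        return value;
    }
}
```

For example, `ImageBase64.stripDataUrlHeader("data:image/jpg;base64,YWJj")` leaves only the payload `YWJj`.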
Example code
1. Face recognition example code
// Configure the request options
HashMap<String, String> options = new HashMap<String, String>();
options.put("face_field", "age,gender,glasses,beauty,expression");
options.put("max_face_num", "2");
options.put("face_type", "LIVE");
// Convert the uploaded image to Base64
String image = Base64Util.part2Base64(imagePart);
String imageType = "BASE64";
// Call the API; the result is returned as JSON
JSONObject json = client.detect(image, imageType, options);
// Process the response
Map<String, Object> map = new HashMap<>();
// Take the first face in the returned face list
JSONObject result = json.getJSONObject("result").getJSONArray("face_list").getJSONObject(0);
// Response field: gender
JSONObject genderObj = result.getJSONObject("gender");
String genderStr = genderObj.getString("type");
if (genderObj.getDouble("probability") >= 0.6) { // translate the label only when the probability is high enough
    if ("female".equals(genderStr)) {
        genderStr = "女";
    } else if ("male".equals(genderStr)) {
        genderStr = "男";
    }
}
map.put("gender", genderStr);
// Return the API data
return ResponseData.success(map);
2. Face comparison example code
// Convert both images to Base64
String image1 = Base64Util.part2Base64(imagePart1);
String image2 = Base64Util.part2Base64(imagePart2);
// Wrap them in the platform's request objects
MatchRequest req1 = new MatchRequest(image1, "BASE64");
MatchRequest req2 = new MatchRequest(image2, "BASE64");
ArrayList<MatchRequest> requests = new ArrayList<MatchRequest>();
requests.add(req1);
requests.add(req2);
// Compare the faces
JSONObject json = client.match(requests);
// Process the response
Map<String, Object> map = new HashMap<>();
// Similarity score; the front end reads it as data.score
double score = json.getJSONObject("result").getDouble("score");
map.put("score", score);
return ResponseData.success(map);
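The page shows the raw score, but the server could also turn it into a yes/no verdict. The sketch below assumes a score on a 0 to 100 scale and a cut-off of 80; both are assumptions for illustration, not values stated in this article, and should be checked against the platform documentation. The class and method names are hypothetical.

```java
import java.util.Locale;

// Hypothetical helper; the scale and threshold are assumptions.
final class MatchVerdict {
    /** Assumed cut-off on an assumed 0-100 similarity scale. */
    static final double ASSUMED_THRESHOLD = 80.0;

    private MatchVerdict() {}

    /** Returns true if the score reaches the assumed threshold. */
    static boolean samePerson(double score) {
        return score >= ASSUMED_THRESHOLD;
    }

    /** Formats the score with one decimal place for display. */
    static String format(double score) {
        return String.format(Locale.ROOT, "%.1f", score);
    }
}
```

Keeping the threshold in one named constant makes it easy to tighten or relax once real photos have been tried.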
3. Speech recognition example code
// File type
String fileType = voicePart.getContentType();
// MP3 and WAV are assumed to be the extension constants ".mp3" and ".wav" defined elsewhere in the class
if (fileType.endsWith("mp3")) {
    fileType = MP3;
} else if (fileType.endsWith("wav")) {
    fileType = WAV;
} else {
    return ResponseData.fail("请上传mp3、wav音频");
}
try {
    // Get the audio input stream
    InputStream is = voicePart.getInputStream();
    // Save a temporary audio file named after the current timestamp
    String fileName = new SimpleDateFormat("yyyyMMddHHmmssSSS").format(Calendar.getInstance().getTime());
    File tmpVoice = new File(workspace + File.separator + fileName + fileType);
    VoiceUtil.saveVoiceFile(is, tmpVoice);
    if (fileType.equals(MP3)) {
        // Transcode MP3 to WAV before recognition
        File mp3File = tmpVoice;
        tmpVoice = new File(tmpVoice.getPath().replace(MP3, WAV));
        if (!VoiceUtil.mp3ToWav(mp3File, tmpVoice)) {
            return ResponseData.fail("mp3音频文件错误,请用wav音频。");
        }
        mp3File.delete();
    }
    // Call the Baidu API
    JSONObject json = client.asr(tmpVoice.getPath(), "wav", 16000, null);
    tmpVoice.delete();
    int status = json.getInt("err_no"); // status code
    if (status != 0) {
        // Handle an error response
        String msg = json.getString("err_msg");
        log.warn("百度接口调用响应异常,error_code:" + status + " error_msg:" + msg);
        return ResponseData.fail(msg);
    }
    // Process the response
    Map<String, Object> map = new HashMap<>();
    // Get the result list
    JSONArray jsonArray = json.getJSONArray("result");
    // Recognized text
    String text = jsonArray.getString(0);
    map.put("text", text);
    return ResponseData.success(map);
} catch (Exception e) {
    log.warn("识别音频文件错误:", e);
    // Return a failure so that every path through the method yields a response
    return ResponseData.fail("识别音频文件错误");
}
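The temporary file naming in this example is easy to get wrong, because `SimpleDateFormat` pattern letters are case-sensitive: `MM` is the month and `mm` the minutes, `HH` is the 24-hour hour, and `SSS` the milliseconds. A minimal sketch of the naming step follows, with hypothetical class and method names. It also swaps only the trailing extension when deriving the WAV name, whereas a plain `String.replace` would change every occurrence of ".mp3" in the path.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper, not part of the original project.
final class TempVoiceNames {
    private TempVoiceNames() {}

    /** Builds a name such as 20240101120000000.mp3 from a timestamp and an extension. */
    static String timestampName(Date when, String extension) {
        // SimpleDateFormat is not thread-safe, so a new instance is created per call.
        SimpleDateFormat format = new SimpleDateFormat("yyyyMMddHHmmssSSS", Locale.ROOT);
        return format.format(when) + extension;
    }

    /** Replaces a trailing .mp3 extension (any case) with .wav; other names are returned unchanged. */
    static String toWavName(String name) {
        if (name.toLowerCase(Locale.ROOT).endsWith(".mp3")) {
            return name.substring(0, name.length() - 4) + ".wav";
        }
        return name;
    }
}
```

The 17-digit timestamp keeps names unique down to the millisecond; two uploads within the same millisecond would still collide.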
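4. Speech synthesis example code
// Request options
HashMap<String, Object> options = new HashMap<String, Object>();
// Speaking speed, 0-9; the default 5 is medium speed
options.put("spd", "5");
// Pitch, 0-9; the default 5 is medium pitch
options.put("pit", "5");
// Voice: 0 is female, 1 is male, 3 is emotional synthesis (Du Xiaoyao), 4 is emotional synthesis (Du Yaya); the default is the standard female voice
options.put("per", voiceType);
// Call the Baidu API
TtsResponse res = client.synthesis(text, "zh", 1, options);
byte[] data = res.getData();
return data;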
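One detail is worth checking here. The speech_produce.html form sends voicetype 1 for the male voice and 2 for the female voice, while the comment above lists 0 as female and 1 as male. If the back end does not already translate between the two, a mapping along these lines would be needed before the value is put into `per`. This is a sketch under that assumption; the class and method are hypothetical.

```java
// Hypothetical helper; assumes the form values from speech_produce.html.
final class VoiceTypeMapper {
    private VoiceTypeMapper() {}

    /** Maps the form's voicetype (1 = male, 2 = female) to the platform's per value. */
    static String toPer(String voiceType) {
        if ("1".equals(voiceType)) {
            return "1"; // male voice
        }
        if ("2".equals(voiceType)) {
            return "0"; // female voice
        }
        return "0"; // anything else falls back to the default female voice
    }
}
```

With this in place, the option would be set as `options.put("per", VoiceTypeMapper.toPer(voiceType));`.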
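Demo
Source code: https://github.com/jcdjor/aiproject
PS: comments, suggestions, and downloads for study are welcome. Some notes on the source code follow.
- aiproject.zip is the back-end code, an Eclipse project. You need to configure your own Baidu cloud platform appid, apikey, and secretkey in the app.properties file.
- web.zip is the front-end code. The front end and back end are separate, and the front end can be run directly.
- Runtime environment: there is no strict version requirement, but the JDK and Tomcat versions must be compatible. I used JDK 1.8 and Tomcat 9.
- Note that an appid, apikey, and secretkey from the Baidu AI open platform are required. Baidu AI open platform: http://ai.baidu.com/
Note: comments on and reposts of this article are welcome. If you use this article or its code, please credit the source in a prominent place. For other questions or suggestions, comment below, contact me on QQ (1414782205), or email jcdjor@163.com.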